
    Predicting the Effectiveness of Self-Training: Application to Sentiment Classification

    Full text link
    The goal of this paper is to investigate the connection between the performance gain that can be obtained by self-training and the similarity between the corpora used in this approach. Self-training is a semi-supervised technique designed to increase the performance of machine learning algorithms by automatically classifying instances of a task and adding these as additional training material to the same classifier. In the context of language processing tasks, this training material is mostly an (annotated) corpus. Unfortunately, self-training does not always lead to a performance increase, and whether it will is largely unpredictable. We show that the similarity between corpora can be used to identify those setups for which self-training can be beneficial. We consider this research a step in the process of developing a classifier that is able to adapt itself to each new test corpus that it is presented with.
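
    As a rough illustration of the self-training loop described above (a minimal sketch, not the authors' setup; the classifier, features, and the 0.8 confidence threshold are assumptions), the following Python snippet retrains a sentiment classifier on its own high-confidence predictions:

        # Minimal self-training sketch: label unlabeled texts with the current
        # classifier and add high-confidence predictions back to the training
        # data before retraining.
        import numpy as np
        from sklearn.feature_extraction.text import TfidfVectorizer
        from sklearn.linear_model import LogisticRegression

        labeled_texts = ["great movie", "terrible plot", "loved it", "awful acting"]
        labels = np.array([1, 0, 1, 0])
        unlabeled_texts = ["really enjoyable film", "boring and bad", "what a mess"]

        vectorizer = TfidfVectorizer()
        X_labeled = vectorizer.fit_transform(labeled_texts).toarray()
        X_unlabeled = vectorizer.transform(unlabeled_texts).toarray()

        clf = LogisticRegression()
        for _ in range(3):                            # a few self-training rounds
            clf.fit(X_labeled, labels)
            if len(X_unlabeled) == 0:
                break
            probs = clf.predict_proba(X_unlabeled)
            confident = probs.max(axis=1) >= 0.8      # assumed confidence threshold
            if not confident.any():
                break
            # self-labeled instances become additional training material
            X_labeled = np.vstack([X_labeled, X_unlabeled[confident]])
            labels = np.concatenate([labels, probs[confident].argmax(axis=1)])
            X_unlabeled = X_unlabeled[~confident]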

    Word-Graph Construction Techniques for Context Analysis

    Get PDF
    A Nomo-Word Graph Construction Analysis Method (NWGC-AM) is used to separate the corresponding construction phrases into essential and non-essential citation groups. The graph resemblance metrics used in this work are Nomo Maximum Common Sub-graph edge resemblance (NMCS-NR), Maximum Common Subgraph Directed Edge Resemblance (MCS-DER), and Maximum Common Subgraph Undirected Edges Resemblance (MCS-UER). The tests included five distinct classifiers: Random Forest, Naive Bayes, K-Nearest Neighbors (KNN), Decision Trees, and Support Vector Machines (SVM). The annotated dataset used for the studies comprised 361 citations. The Decision Tree classifier exhibits superior performance, attaining an accuracy rate of 0.98.
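
    The exact metric definitions are not given here, but an MCS-style undirected edge resemblance between two word graphs can be sketched as follows (the overlap-of-edge-sets approximation, the normalization by the larger edge count, and the toy graphs are assumptions, not the paper's formulation):

        # Approximate MCS undirected-edge resemblance: intersect the edge sets,
        # assuming nodes are word labels shared across graphs, and normalize by
        # the larger graph's edge count. A directed variant would compare
        # ordered edge tuples instead of frozensets.
        import networkx as nx

        def edge_resemblance(g1: nx.Graph, g2: nx.Graph) -> float:
            common = set(map(frozenset, g1.edges())) & set(map(frozenset, g2.edges()))
            denom = max(g1.number_of_edges(), g2.number_of_edges())
            return len(common) / denom if denom else 0.0

        g_a = nx.Graph([("method", "proposed"), ("proposed", "novel"), ("novel", "approach")])
        g_b = nx.Graph([("method", "proposed"), ("proposed", "baseline")])
        print(edge_resemblance(g_a, g_b))   # one shared edge out of three -> 0.33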

    Annotated Corpus for Citation Context Analysis

    Get PDF
    In this paper, we present a corpus composed of 85 scientific articles annotated with 2092 citations analyzed using context analysis. We obtained high inter-annotator agreement; therefore, we assure reliability and reproducibility of the annotation, which was performed independently by three coders. We applied this corpus to classify citations according to qualitative criteria using a medium-granularity categorization scheme, enriched by annotated keywords and labels to obtain high granularity. The annotation schema handles three dimensions: PURPOSE, POLARITY, and ASPECTS. Citation purpose defines the function classification (use, critique, comparison, and background), with more specific classes established using keywords: Based on, Supply, Useful, Contrast, Acknowledge, Corroboration, Debate, Weakness, and Hedges. Citation aspects complement the citation characterization: concept, method, data, tool, task, among others. Polarity has three levels: Positive, Negative, and Neutral. We developed the schema and annotated the corpus focusing on applications for citation influence assessment, but we suggest that applications such as summary generation and information retrieval could also use this annotated corpus because the scheme is organized into clearly defined general dimensions.
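
    A minimal sketch of what one annotated citation might look like under this three-dimensional schema (field names and example values are illustrative assumptions, not the corpus's actual file format):

        # One hypothetical citation record covering PURPOSE, POLARITY and ASPECTS.
        from dataclasses import dataclass, field

        @dataclass
        class CitationAnnotation:
            context: str                 # the citing sentence(s)
            purpose: str                 # use, critique, comparison, background
            purpose_keyword: str         # e.g. Based on, Contrast, Weakness, Hedges
            polarity: str                # Positive, Negative, Neutral
            aspects: list = field(default_factory=list)  # concept, method, data, tool, task, ...

        example = CitationAnnotation(
            context="We adopt the evaluation protocol of [12].",
            purpose="use",
            purpose_keyword="Based on",
            polarity="Neutral",
            aspects=["method"],
        )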

    Disciplinary Difference in Citation Opinion Expressions

    Get PDF
    This study examines academic opinion expressions in citation context. We first developed an annotation schema to annotate three aspects of each academic opinion expressed in a citation statement: rhetorical purpose, content aspect, and opinion polarity. We then annotated two samples: a natural science sample consisting of biomedical journal articles, and an engineering sample consisting of conference papers in the natural language processing field. A comparison of the annotations on the two samples showed disciplinary differences in citation opinion expressions. The result contributes to the understanding of academic opinion expressions in citation context and the development of automated citation opinion analysis tools to assist researchers' literature search and navigation.

    CORWA: A Citation-Oriented Related Work Annotation Dataset

    Full text link
    Academic research is an exploratory activity to discover new solutions to problems. By its nature, academic papers perform literature reviews to distinguish their novelty from prior work. In natural language processing, this literature review is usually conducted in the "Related Work" section. The task of related work generation aims to automatically generate the related work section given the rest of the research paper and a list of papers to cite. Prior work on this task has focused on the sentence as the basic unit of generation, neglecting the fact that related work sections consist of variable-length text fragments derived from different information sources. As a first step toward a linguistically-motivated related work generation framework, we present a Citation Oriented Related Work Annotation (CORWA) dataset that labels different types of citation text fragments from different information sources. We train a strong baseline model that automatically tags the CORWA labels on massive unlabeled related work section texts. We further suggest a novel framework for human-in-the-loop, iterative, abstractive related work generation. (Comment: Accepted by NAACL 2022)

    A Correlation Study of Co-opinion and Co-citation Similarity Measures

    Get PDF
    Co-citation forms a relational document network. Co-citation-based measures are found to be effective in retrieving relevant documents. However, they are far from ideal and need further enhancement. The co-opinion concept was proposed and tested in previous research and found to be effective in retrieving relevant documents. The present study explores the correlation between opinion (dis)similarity measures and traditional co-citation-based ones, including the Citation Proximity Index (CPI), co-citedness, and co-citation context similarity. The results show significant, though weak to medium, correlations between the variables. The correlations are direct for the co-opinion measure, while being inverse for the opinion distance. Accordingly, the two groups of measures are revealed to represent some similar aspects of the document relation. Moreover, the weakness of the correlations implies that there are different dimensions represented by the two groups.
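
    A hedged sketch of the kind of correlation analysis described, using Spearman's rank correlation over document pairs (the values below are invented toy numbers, not the study's data):

        # Correlate a co-opinion similarity measure and an opinion distance with
        # co-citedness across document pairs; the study reports direct correlations
        # for the former and inverse correlations for the latter.
        from scipy.stats import spearmanr

        co_opinion_similarity = [0.82, 0.40, 0.65, 0.10, 0.55]   # per document pair
        opinion_distance      = [0.18, 0.60, 0.35, 0.90, 0.45]
        co_citedness          = [12, 3, 7, 1, 5]

        rho, p = spearmanr(co_opinion_similarity, co_citedness)
        print(f"co-opinion vs co-citedness: rho={rho:.2f}, p={p:.3f}")

        rho, p = spearmanr(opinion_distance, co_citedness)
        print(f"opinion distance vs co-citedness: rho={rho:.2f}, p={p:.3f}")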