7 research outputs found

Visual analytics in FCA-based clustering

Author: Kashnitsky Yury
Publication date: 21/04/2015

Visual analytics is a subdomain of data analysis that combines human and machine analytical abilities and is applied mostly in decision-making and data mining tasks. Triclustering, based on Formal Concept Analysis (FCA), was developed to detect groups of objects with similar properties under similar conditions. It is used in Social Network Analysis (SNA) and is a basis for certain types of recommender systems. A known problem with triclustering algorithms is that they do not always produce meaningful clusters. This article describes a specific triclustering algorithm and a prototype of a visual analytics platform for working with the obtained clusters. The tool is designed as a testing framework and is intended to help an analyst grasp the results of triclustering and recommender algorithms and decide whether particular triclusters and recommendations are meaningful.

Comment: 11 pages, 3 figures, 2 algorithms, 3rd International Conference on Analysis of Images, Social Networks and Texts (AIST'2014). In Supplementary Proceedings of the 3rd International Conference on Analysis of Images, Social Networks and Texts (AIST 2014), Vol. 1197, CEUR-WS.org, 201
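To make the idea of "objects with similar properties under similar conditions" concrete, here is a minimal sketch of one FCA-style way to build a tricluster from a set of (object, attribute, condition) triples, together with a density score that can hint at whether a tricluster is meaningful. This is an illustrative assumption, not necessarily the algorithm described in the article; the function names `prime_tricluster` and `density` and the toy data are hypothetical.

```python
def prime_tricluster(triples, g, m, b):
    """Build a tricluster around one triple (g, m, b).

    Assumption: each component is the set of values that co-occur
    with the other two fixed coordinates of the generating triple.
    """
    objects = {g2 for (g2, m2, b2) in triples if (m2, b2) == (m, b)}
    attributes = {m2 for (g2, m2, b2) in triples if (g2, b2) == (g, b)}
    conditions = {b2 for (g2, m2, b2) in triples if (g2, m2) == (g, m)}
    return objects, attributes, conditions


def density(triples, tricluster):
    """Fraction of cells in the tricluster's box that are actual triples."""
    objects, attributes, conditions = tricluster
    volume = len(objects) * len(attributes) * len(conditions)
    filled = sum(
        (g, m, b) in triples
        for g in objects
        for m in attributes
        for b in conditions
    )
    return filled / volume


# Hypothetical toy data: (user, tag, document) triples.
triples = {
    ("u1", "tag1", "doc1"),
    ("u2", "tag1", "doc1"),
    ("u1", "tag2", "doc1"),
    ("u1", "tag1", "doc2"),
}

cluster = prime_tricluster(triples, "u1", "tag1", "doc1")
print(cluster, density(triples, cluster))
```

A low density suggests that the tricluster covers many combinations that never occur in the data, which is one simple signal an analyst might use when judging meaningfulness.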

arXiv.org e-Print Archive

CiteSeerX

Can FCA-based Recommender System Suggest a Proper Classifier?

Author: Ignatov Dmitry I.
Kashnitsky Yury
Publication date: 21/04/2015

The paper briefly introduces multiple classifier systems and describes a new algorithm that improves classification accuracy by recommending a suitable classifier for each object to be classified. The recommendation assumes that a classifier is likely to predict an object's label correctly if it has correctly classified that object's neighbors. The process of assigning a classifier to each object is based on Formal Concept Analysis. We explain the idea of the algorithm with a toy example and describe our first experiments with real-world datasets.

Comment: 10 pages, 1 figure, 4 tables, ECAI 2014, workshop "What FCA can do for Artificial Intelligence"
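The neighbor-based assumption above can be sketched in a few lines. Note the swap: this sketch finds neighbors with plain Euclidean distance rather than Formal Concept Analysis, so it illustrates only the selection principle, not the paper's algorithm. The name `recommend_classifier`, the two rule-based classifiers, and the toy data are hypothetical.

```python
from math import dist


def recommend_classifier(x, train_X, train_y, classifiers, k=3):
    """Return the classifier that was correct most often on the
    k training objects nearest to x (Euclidean distance, an assumption)."""
    nearest = sorted(range(len(train_X)), key=lambda i: dist(x, train_X[i]))[:k]

    def hits(clf):
        # Number of neighbors whose known label the classifier reproduces.
        return sum(clf(train_X[i]) == train_y[i] for i in nearest)

    return max(classifiers, key=hits)


# Two hypothetical rule-based classifiers that disagree everywhere.
def clf_a(p):
    return int(p[0] > 0.5)


def clf_b(p):
    return int(p[0] <= 0.5)


# Toy data: clf_a is right in the upper region, clf_b in the lower one.
train_X = [(0.2, 0.9), (0.8, 0.9), (0.4, 0.8), (0.2, 0.1), (0.8, 0.1), (0.6, 0.2)]
train_y = [0, 1, 0, 1, 0, 0]

upper_choice = recommend_classifier((0.5, 0.85), train_X, train_y, [clf_a, clf_b])
lower_choice = recommend_classifier((0.5, 0.15), train_X, train_y, [clf_a, clf_b])
print(upper_choice.__name__, lower_choice.__name__)
```

In practice the per-neighbor correctness would come from held-out predictions of trained models rather than fixed rules; fixed rules are used here only to keep the example self-contained.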

arXiv.org e-Print Archive

CiteSeerX

Improving Article Classification with Edge-Heterogeneous Graph Neural Networks

Author: Chamezopoulos Savvas
Kashnitsky Yury
Krzhizhanovskaya Valeria
Ly Khang
Publication date: 20/09/2023

Classifying research output into context-specific label taxonomies is a challenging and relevant downstream task, given the volume of existing and newly published articles. We propose a method to enhance the performance of article classification by enriching simple Graph Neural Network (GNN) pipelines with edge-heterogeneous graph representations. SciBERT is used for node feature generation to capture higher-order semantics within the articles' textual metadata. Fully supervised transductive node classification experiments are conducted on the Open Graph Benchmark (OGB) ogbn-arxiv dataset and the PubMed diabetes dataset, augmented with additional metadata from Microsoft Academic Graph (MAG) and PubMed Central, respectively. The results demonstrate that edge-heterogeneous graphs consistently improve the performance of all GNN models compared to edge-homogeneous graphs. The transformed data enable simple and shallow GNN pipelines to achieve results on par with more complex architectures. On ogbn-arxiv, we achieve a top-15 result in the OGB competition with a 2-layer GCN (accuracy 74.61%), being the highest-scoring solution with sub-1 million parameters. On PubMed, we closely trail SOTA GNN architectures using a 2-layer GraphSAGE by including additional co-authorship edges in the graph (accuracy 89.88%). The implementation is available at: https://github.com/lyvykhang/edgehetero-nodeproppred
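As a rough illustration of what an edge-heterogeneous graph adds, the sketch below stores edges per type and runs one unweighted aggregation round in which each node sums its own features with the mean of its neighbors' features for every edge type. It is a simplified assumption written in plain Python, not the authors' implementation (see the repository above for that); the function name `aggregate`, the edge-type names, and the toy features are hypothetical.

```python
def aggregate(features, edges_by_type):
    """One message-passing round without learned weights: own features
    plus, for each edge type, the mean of the neighbors' features."""
    out = {}
    for node, feat in features.items():
        total = list(feat)
        for edge_type, edges in edges_by_type.items():
            # Treat every edge as undirected for simplicity (an assumption).
            nbrs = [v for u, v in edges if u == node]
            nbrs += [u for u, v in edges if v == node]
            if not nbrs:
                continue
            for d in range(len(feat)):
                total[d] += sum(features[n][d] for n in nbrs) / len(nbrs)
        out[node] = total
    return out


# Hypothetical toy graph: three papers with 2-dimensional features.
features = {"p1": [1.0, 0.0], "p2": [0.0, 1.0], "p3": [1.0, 1.0]}

homogeneous = {"cites": [("p1", "p2")]}
heterogeneous = {"cites": [("p1", "p2")], "co_author": [("p1", "p3")]}

print(aggregate(features, homogeneous)["p3"])    # p3 has no citation neighbors
print(aggregate(features, heterogeneous)["p3"])  # p3 now receives p1's features
```

With only citation edges, the paper `p3` is isolated and its representation stays unchanged; the extra co-authorship edge type gives it a neighbor to draw information from, which is the intuition behind enriching the graph with additional edge types.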

arXiv.org e-Print Archive

Migration data, Russia, 2003-2013

Author: Ilya Kashnitsky (816969)
Nikita Mkrtchyan (2555590)
Yury Kashnitsky (2555560)

This Excel file contains annual net migration records for Russian regions by 1-year age groups, from 0 to 80, for the periods 2003-2010 and 2011-2013. The first period is bounded by the two Russian Censuses (end of 2002 and end of 2010). The second period is limited by the availability of data. Moreover, there was a significant change in current migration recording in 2011, so the data for the two periods are barely comparable. There are 78 regions, as the data for the Moscow and Leningrad regions are merged with the data for the federal cities of Moscow and St. Petersburg, respectively.

List of data files:
IR_0310.csv - inter-regional migration in 2003-2010
IN_0310.csv - international migration in 2003-2010
IR_1113.csv - inter-regional migration in 2011-2013
IN_1113.csv - international migration in 2011-2013

FigShare

Overview of the DagPap22 Shared Task on Detecting Automatically Generated Scientific Papers

Author: de Waard Anita
Fennell Catriona
Herrmannova Drahomira
Kashnitsky Yury
Labbé Cyril
Tsatsaronis Georgios
Publication venue: HAL CCSD
Publication date: 01/10/2022

This paper provides an overview of the 2022 COLING Scholarly Document Processing workshop shared task on the detection of automatically generated scientific papers. We frame the detection problem as a binary classification task: given an excerpt of text, label it as either human-written or machine-generated. We shared a dataset containing excerpts from human-written papers as well as artificially generated content and suspicious documents collected by Elsevier publishing and editorial teams. As a test set, the participants were provided with a 5x larger corpus of openly accessible human-written and generated papers from the same scientific domains. The shared task saw 180 submissions across 14 participating teams and resulted in two published technical reports. We discuss our findings from the shared task in this overview paper.

Hal - Université Grenoble Alpes

Evaluating approaches to identifying research supporting the United Nations Sustainable Development Goals

Author: Boonen Finne
Doornenbal Marius
James Chris
Jaworek Robert
Jayabalasingham Bamini
Kang Kevin
Kashnitsky Yury
Keßler Lennart
Labrosse Isabelle
Mu Jingwen
Rivest Maxim
Roberge Guillaume
Vanderfeesten Maurice
Vignes Maéva
Wang Weiwei
Publication date: 10/05/2023

The United Nations (UN) Sustainable Development Goals (SDGs) challenge the global community to build a world where no one is left behind. Recognizing that research plays a fundamental part in supporting these goals, attempts have been made to classify research publications according to their relevance in supporting each of the UN's SDGs. In this paper, we outline the methodology that we followed when mapping research articles to SDGs and that has been adopted by Times Higher Education in their Social Impact rankings. We compare our solution with other existing queries and models that map research papers to SDGs. We also discuss various ways in which the methodology can be improved and generalized to other types of content apart from research articles. The results presented in this paper are the outcome of the SDG Research Mapping Initiative, which was established as a partnership between the University of Southern Denmark, the Aurora European Universities Alliance (represented by Vrije Universiteit Amsterdam), the University of Auckland, and Elsevier to bring together broad expertise and share best practices on identifying research contributions to the UN's Sustainable Development Goals.

Comment: 12 pages, 3 figures, 7 tables, 19 references

arXiv.org e-Print Archive