Segmented Shape-Symbolic Time Series Representation
Abstract. This paper introduces a symbolic time series representation using monotonic sub-sequences and bottom-up segmentation. The representation minimizes the squared error between the segments and their monotonic approximations. The representation can robustly classify the direction of a segment and is scale-invariant with respect to the time and value dimensions. This paper describes two experiments. The first shows how accurately the monotonic functions are able to discriminate between different segments. The second tests how well the segmentation technique recognizes segments and classifies them with correct symbols. Finally, this paper illustrates the new representation on real-world data.
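The core idea of the abstract can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes isotonic (pool-adjacent-violators) regression as the monotonic approximation, an even initial partition, and a fixed target number of segments; the function names (`isotonic_fit`, `monotone_error`, `bottom_up_segment`) and the "u"/"d" direction symbols are invented for this sketch.

```python
import numpy as np

def isotonic_fit(y, increasing=True):
    """Pool-adjacent-violators: least-squares monotonic fit to y."""
    y = np.asarray(y, dtype=float)
    if not increasing:
        # A non-increasing fit is a non-decreasing fit to the reversed series.
        return isotonic_fit(y[::-1])[::-1]
    sums, counts = [], []          # adjacent blocks as (sum, size)
    for v in y:
        sums.append(v); counts.append(1)
        # Merge blocks while their means violate the non-decreasing order.
        while len(sums) > 1 and sums[-2] / counts[-2] >= sums[-1] / counts[-1]:
            sums[-2] += sums[-1]; counts[-2] += counts[-1]
            sums.pop(); counts.pop()
    return np.concatenate([np.full(c, s / c) for s, c in zip(sums, counts)])

def monotone_error(y):
    """Squared error of the best monotonic approximation, and its direction."""
    up = np.sum((y - isotonic_fit(y, True)) ** 2)
    down = np.sum((y - isotonic_fit(y, False)) ** 2)
    return (up, "u") if up <= down else (down, "d")

def bottom_up_segment(y, k):
    """Greedily merge adjacent fine segments until k segments remain."""
    bounds = list(range(0, len(y), 2)) + [len(y)]   # initial fine partition
    while len(bounds) - 1 > k:
        best_i, best_cost = None, None
        for i in range(1, len(bounds) - 1):
            cost, _ = monotone_error(y[bounds[i - 1]:bounds[i + 1]])
            if best_cost is None or cost < best_cost:
                best_i, best_cost = i, cost
        bounds.pop(best_i)                          # cheapest merge wins
    return [(bounds[i], bounds[i + 1], monotone_error(y[bounds[i]:bounds[i + 1]])[1])
            for i in range(len(bounds) - 1)]
```

Because each segment is summarized only by the direction of its best monotonic fit, the resulting symbols are invariant to rescaling of both axes, matching the scale-invariance claim above.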
Structure-Tags Improve Text Classification for Scholarly Document Quality Prediction
Training recurrent neural networks on long texts, in particular scholarly
documents, causes problems for learning. While hierarchical attention networks
(HANs) are effective in solving these problems, they still lose important
information about the structure of the text. To tackle these problems, we
propose the use of HANs combined with structure-tags which mark the role of
sentences in the document. Adding tags to sentences, marking them as
corresponding to title, abstract or main body text, yields improvements over
the state-of-the-art for scholarly document quality prediction. The proposed
system is applied to the task of accept/reject prediction on the PeerRead
dataset and compared against a recent BiLSTM-based model and joint
textual+visual model as well as against plain HANs. Compared to plain HANs,
accuracy increases on all three domains. On the computation and language domain,
our new model works best overall and increases accuracy by 4.7% over the best
literature result. We also obtain improvements when introducing the tags for
prediction of the number of citations for 88k scientific publications that we
compiled from the Allen AI S2ORC dataset. For our HAN-system with
structure-tags we reach 28.5% explained variance, an improvement of 1.8% over
our reimplementation of the BiLSTM-based model as well as 1.0% improvement over
plain HANs.

Comment: This version brings the paper up-to-date with the improved paper published at the First Workshop on Scholarly Document Processing at EMNLP 2020. Additionally, minor corrections were made, including the addition of color to Figures 1 and 2. The changes in comparison to the first arXiv version are substantial, including various additional results and substantial improvements to the text.
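The structure-tagging step described in this abstract can be illustrated with a small preprocessing sketch. The tag vocabulary and the function below are hypothetical, assumed for illustration only; the abstract specifies the roles (title, abstract, main body) but not the exact tokenization or tag format used by the authors.

```python
# Illustrative only: prepend a role tag token to each sentence so a
# hierarchical attention network can see each sentence's structural role.
STRUCTURE_TAGS = {"title": "<TITLE>", "abstract": "<ABSTRACT>", "body": "<BODY>"}

def tag_sentences(document):
    """document: list of (role, sentence) pairs.
    Returns one token list per sentence, with the role tag prepended so it
    participates in word-level attention like an ordinary token."""
    tagged = []
    for role, sentence in document:
        tagged.append([STRUCTURE_TAGS[role]] + sentence.split())
    return tagged

doc = [
    ("title", "Structure-Tags Improve Text Classification"),
    ("abstract", "We propose HANs combined with structure-tags."),
    ("body", "Experiments are run on the PeerRead dataset."),
]
```

The tagged token lists would then be embedded and fed to the sentence-level encoder of the HAN as usual; only the input representation changes, not the network architecture.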