PELP: Pioneer Event Log Prediction Using Sequence-to-Sequence Neural Networks
Process mining, a data-driven approach for analyzing, visualizing, and
improving business processes using event logs, has emerged as a powerful
technique in the field of business process management. Process forecasting is a
sub-field of process mining that studies how to predict future processes and
process models. In this paper, we introduce and motivate the problem of event
log prediction and present our approach to solving the event log prediction
problem, in particular, using the sequence-to-sequence deep learning approach.
We evaluate and analyze the prediction outcomes on a variety of synthetic logs
and seven real-life logs and show that our approach can generate perfect
predictions on synthetic logs and that deep learning techniques have the
potential to be applied in real-world event log prediction tasks. We further
provide practical recommendations for event log predictions grounded in the
outcomes of the conducted experiments. Comment: CAiSE 2024 submission
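The abstract frames event log prediction as mapping an observed trace prefix to its future suffix. The sketch below illustrates that framing only; it substitutes a simple bigram (Markov) baseline for the paper's sequence-to-sequence model, and the activity labels and traces are made-up examples, not data from the paper.

```python
from collections import Counter, defaultdict

# Hypothetical event log: each trace is a sequence of activity labels.
traces = [
    ["register", "check", "approve", "pay"],
    ["register", "check", "reject"],
    ["register", "check", "approve", "pay"],
]

# Count activity-to-successor transitions, with an end-of-sequence marker.
bigrams = defaultdict(Counter)
for trace in traces:
    for cur, nxt in zip(trace, trace[1:] + ["<eos>"]):
        bigrams[cur][nxt] += 1

def predict_suffix(prefix, max_len=10):
    """Greedily extend a prefix with the most frequent successor activity."""
    out = list(prefix)
    while len(out) < max_len:
        nxt = bigrams[out[-1]].most_common(1)
        if not nxt or nxt[0][0] == "<eos>":
            break
        out.append(nxt[0][0])
    return out

print(predict_suffix(["register"]))  # → ['register', 'check', 'approve', 'pay']
```

A sequence-to-sequence network replaces the frequency table with a learned encoder-decoder, but the input/output contract (prefix in, suffix out) is the same.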
Ground-based detection of the near-infrared emission from the dayside of WASP-5b
(Abridged) WASP-5b is a highly irradiated dense hot Jupiter orbiting a G4V
star every 1.6 days. We observed two secondary eclipses of WASP-5b in the J, H
and K bands simultaneously. Thermal emission of WASP-5b is detected in the J
and K bands. The retrieved planet-to-star flux ratios in the J and K bands are
0.168 +0.050/-0.052% and 0.269+/-0.062%, corresponding to brightness
temperatures of 2996 +212/-261K and 2890 +246/-269K, respectively. No thermal
emission is detected in the H band, with a 3-sigma upper limit of 0.166%,
corresponding to a maximum temperature of 2779K. On the whole, our J, H, K
results can be explained by a roughly isothermal temperature profile of ~2700K
in the deep layers of the planetary dayside atmosphere that are probed at these
wavelengths. Together with Spitzer observations, which probe higher layers that
are found to be at ~1900K, a temperature inversion is ruled out in the range of
pressures probed by the combined data set. While an oxygen-rich model is unable
to explain all the data, a carbon-rich model provides a reasonable fit but
violates energy balance. Comment: 13 pages, 9 figures, accepted for publication in A&
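The brightness temperatures quoted above come from inverting the Planck function: the measured planet-to-star flux ratio fixes the ratio of the two blackbody intensities once the radius ratio is known. The sketch below shows that conversion; the K-band flux ratio is taken from the abstract, but the stellar temperature and radius ratio are placeholder values, not the paper's fitted parameters.

```python
import math

# Physical constants (SI)
H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
KB = 1.380649e-23    # Boltzmann constant, J/K

def brightness_temperature(flux_ratio, wavelength, t_star, rp_over_rs):
    """Invert the Planck function for the planet's brightness temperature.

    Fp/Fs = (Rp/Rs)**2 * B(Tp)/B(Ts), so B(Tp) is known and Tp follows
    from solving B(Tp) = x * B(Ts) for Tp.
    """
    a = H * C / (wavelength * KB)        # hc / (lambda * k), in kelvin
    x = flux_ratio / rp_over_rs**2       # B(Tp) / B(Ts)
    return a / math.log(1.0 + (math.exp(a / t_star) - 1.0) / x)

# Illustrative inputs: 0.269% K-band flux ratio (from the abstract);
# 5700 K and Rp/Rs = 0.11 are assumed placeholders for a G-type host.
tb = brightness_temperature(0.00269, 2.2e-6, 5700.0, 0.11)
print(f"{tb:.0f} K")
```

With these placeholder stellar parameters the result lands near the ~2900 K value reported for the K band, as expected for a highly irradiated dayside.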
Distance Guided Channel Weighting for Semantic Segmentation
Recent works have achieved great success in improving the performance of
multiple computer vision tasks by capturing features with a high channel number
utilizing deep neural networks. However, many channels of extracted features
are not discriminative and contain a lot of redundant information. In this
paper, we address above issue by introducing the Distance Guided Channel
Weighting (DGCW) Module. The DGCW module is constructed in a pixel-wise context
extraction manner, which enhances the discriminativeness of features by
weighting different channels of each pixel's feature vector when modeling its
relationship with other pixels. It can make full use of the high-discriminative
information while ignoring the low-discriminative information contained in
feature maps, and can also capture long-range dependencies. Furthermore, by
incorporating the DGCW module with a baseline segmentation network, we propose
the Distance Guided Channel Weighting Network (DGCWNet). We conduct extensive
experiments to demonstrate the effectiveness of DGCWNet. In particular, it
achieves 81.6% mIoU on Cityscapes with only fine annotated data for training,
and also achieves satisfactory performance on two other semantic segmentation
datasets, Pascal Context and ADE20K. Code will be available soon at
https://github.com/LanyunZhu/DGCWNet
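The core idea above is that each pixel's channels are reweighted before its relationships with other pixels are modelled. The snippet below is a generic NumPy sketch of per-pixel channel weighting, not the authors' DGCW module: it simply weights each channel by how far its value lies from that channel's spatial mean, so that discriminative (outlying) channels dominate.

```python
import numpy as np

def channel_weighting(feat, i, j):
    """Reweight the channels of pixel (i, j) of a (C, H, W) feature map.

    Generic illustration only: the weight of each channel is its distance
    from the per-channel spatial mean, normalised to sum to one.
    """
    c, h, w = feat.shape
    pixel = feat[:, i, j]                       # (C,) feature vector
    mean = feat.reshape(c, -1).mean(axis=1)     # per-channel mean over pixels
    dist = np.abs(pixel - mean)                 # channel-wise distance
    weights = dist / (dist.sum() + 1e-8)        # normalise to a distribution
    return weights * pixel                      # low-distance channels damped

rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 4, 4))
out = channel_weighting(feat, 1, 2)
print(out.shape)  # (8,)
```

In a real segmentation network this weighting would be computed by learned layers and applied across all pixels at once; the sketch only shows the per-pixel contract.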
Text Augmentation: Inserting markup into natural language text with PPM Models
This thesis describes a new optimisation and new heuristics for automatically marking up XML documents. These are implemented in CEM, using PPM models. CEM is significantly more general than previous systems, marking up large numbers of hierarchical tags, using n-gram models for large n and a variety of escape methods.
Four corpora are discussed, including the bibliography corpus of 14,682 bibliographies laid out in seven standard styles using the BibTeX system and marked up in XML with every field from the original BibTeX. Other corpora include the ROCLING Chinese text segmentation corpus, the Computists’ Communique corpus and the Reuters’ corpus. A detailed examination is presented of the methods of evaluating markup algorithms, including computational complexity measures and correctness measures from the fields of information retrieval, string processing, machine learning and information theory.
A new taxonomy of markup complexities is established and the properties of each taxon are examined in relation to the complexity of marked-up documents. The performance of the new heuristics and optimisation is examined using the four corpora.
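The central idea of PPM-based markup is that each tag has its own compression model, and a span of text receives the tag whose model encodes it most cheaply. The toy sketch below substitutes a simple order-1 character model with add-one smoothing for a real PPM model; the tags and training strings are invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy training data: a few strings per tag (invented, not from the thesis).
training = {
    "year":   ["1994", "2003", "1987"],
    "author": ["Knuth", "Turing", "Hamming"],
}

# One order-1 character model per tag: counts[prev][cur].
models = {}
for tag, samples in training.items():
    counts = defaultdict(Counter)
    for s in samples:
        for prev, cur in zip(" " + s, s):
            counts[prev][cur] += 1
    models[tag] = counts

def code_length(tag, span, alphabet=96):
    """Bits needed to encode `span` under the tag's order-1 model."""
    counts = models[tag]
    bits = 0.0
    for prev, cur in zip(" " + span, span):
        total = sum(counts[prev].values())
        p = (counts[prev][cur] + 1) / (total + alphabet)  # add-one smoothing
        bits -= math.log2(p)
    return bits

def mark_up(span):
    """Assign the tag whose model compresses the span best."""
    return min(models, key=lambda tag: code_length(tag, span))

print(mark_up("2010"))  # digit sequences encode cheapest under "year"
```

PPM improves on this sketch by blending contexts of several orders with escape probabilities, which is what lets CEM handle large n and hierarchical tags.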
A Compression-Based Toolkit for Modelling and Processing Natural Language Text
A novel compression-based toolkit for modelling and processing natural language text is described. The design of the toolkit adopts an encoding perspective—applications are considered to be problems in searching for the best encoding of different transformations of the source text into the target text. This paper describes a two-phase ‘noiseless channel model’ architecture that underpins the toolkit, which models text processing as lossless communication down a noise-free channel. The transformation and encoding that are performed in the first phase must be both lossless and reversible. The role of the verification and decoding second phase is to verify the correctness of the communication of the target text that is produced by the application. This paper argues that this encoding approach has several advantages over the decoding approach of the standard noisy channel model. The concepts abstracted by the toolkit’s design are explained together with details of the library calls. The pseudo-code for a number of algorithms is also described for the applications that the toolkit implements, including encoding, decoding, classification, training (model building), parallel sentence alignment, word segmentation and language segmentation. Some experimental results, implementation details, memory usage and execution speeds are also discussed for these applications.
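Among the listed applications, word segmentation shows the "search for the best encoding" idea most directly: the chosen segmentation is the one whose words are cheapest to encode. The sketch below uses dynamic programming over an assumed unigram cost model; the real toolkit scores candidate segmentations with compression models rather than a fixed lexicon.

```python
import math

# Assumed unigram probabilities (illustrative, not from the toolkit).
lexicon = {"the": 0.04, "them": 0.002, "me": 0.01, "then": 0.003}

def word_cost(word):
    """Code length of a word in bits: -log2 p, or a steep penalty if unknown."""
    p = lexicon.get(word)
    return -math.log2(p) if p else 40.0 + 8.0 * len(word)

def segment(text):
    """Find the minimum-total-code-length segmentation by dynamic programming."""
    n = len(text)
    best = [(0.0, 0)] + [(math.inf, 0)] * n   # best[j] = (cost, split point)
    for j in range(1, n + 1):
        for i in range(max(0, j - 12), j):    # cap candidate word length at 12
            cost = best[i][0] + word_cost(text[i:j])
            if cost < best[j][0]:
                best[j] = (cost, i)
    words, j = [], n                          # walk the split points backwards
    while j > 0:
        i = best[j][1]
        words.append(text[i:j])
        j = i
    return words[::-1]

print(segment("theme"))  # → ['the', 'me'], cheaper than 'them' + unknown 'e'
```

Swapping `word_cost` for a PPM model's code length turns the same search into the compression-based segmenter the paper describes.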