PELP: Pioneer Event Log Prediction Using Sequence-to-Sequence Neural Networks
Process mining, a data-driven approach for analyzing, visualizing, and
improving business processes using event logs, has emerged as a powerful
technique in the field of business process management. Process forecasting is a
sub-field of process mining that studies how to predict future processes and
process models. In this paper, we introduce and motivate the problem of event
log prediction and present our approach to solving the event log prediction
problem, in particular, using the sequence-to-sequence deep learning approach.
We evaluate and analyze the prediction outcomes on a variety of synthetic logs
and seven real-life logs and show that our approach can generate perfect
predictions on synthetic logs and that deep learning techniques have the
potential to be applied in real-world event log prediction tasks. We further
provide practical recommendations for event log predictions grounded in the
outcomes of the conducted experiments. Comment: CAiSE 2024 submission
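The abstract frames event log prediction as mapping an observed trace prefix to its future suffix. The sketch below illustrates that framing only; it substitutes a simple bigram (Markov) baseline for the paper's sequence-to-sequence model, and the activity labels and traces are made-up examples, not data from the paper.

```python
from collections import Counter, defaultdict

# Hypothetical event log: each trace is a sequence of activity labels.
traces = [
    ["register", "check", "approve", "pay"],
    ["register", "check", "reject"],
    ["register", "check", "approve", "pay"],
]

# Count activity-to-successor transitions, with an end-of-sequence marker.
bigrams = defaultdict(Counter)
for trace in traces:
    for cur, nxt in zip(trace, trace[1:] + ["<eos>"]):
        bigrams[cur][nxt] += 1

def predict_suffix(prefix, max_len=10):
    """Greedily extend a prefix with the most frequent successor activity."""
    out = list(prefix)
    while len(out) < max_len:
        nxt = bigrams[out[-1]].most_common(1)
        if not nxt or nxt[0][0] == "<eos>":
            break
        out.append(nxt[0][0])
    return out

print(predict_suffix(["register"]))  # → ['register', 'check', 'approve', 'pay']
```

A sequence-to-sequence network replaces the frequency table with a learned encoder-decoder, but the input/output contract (prefix in, suffix out) is the same.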
Ground-based detection of the near-infrared emission from the dayside of WASP-5b
(Abridged) WASP-5b is a highly irradiated dense hot Jupiter orbiting a G4V
star every 1.6 days. We observed two secondary eclipses of WASP-5b in the J, H
and K bands simultaneously. Thermal emission of WASP-5b is detected in the J
and K bands. The retrieved planet-to-star flux ratios in the J and K bands are
0.168 +0.050/-0.052% and 0.269+/-0.062%, corresponding to brightness
temperatures of 2996 +212/-261K and 2890 +246/-269K, respectively. No thermal
emission is detected in the H band, with a 3-sigma upper limit of 0.166%,
corresponding to a maximum temperature of 2779K. On the whole, our J, H, K
results can be explained by a roughly isothermal temperature profile of ~2700K
in the deep layers of the planetary dayside atmosphere that are probed at these
wavelengths. Together with Spitzer observations, which probe higher layers that
are found to be at ~1900K, a temperature inversion is ruled out in the range of
pressures probed by the combined data set. While an oxygen-rich model is unable
to explain all the data, a carbon-rich model provides a reasonable fit but
violates energy balance. Comment: 13 pages, 9 figures, accepted for publication in A&
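The brightness temperatures quoted above come from inverting the Planck function: the measured planet-to-star flux ratio fixes the ratio of the two blackbody intensities once the radius ratio is known. The sketch below shows that conversion; the K-band flux ratio is taken from the abstract, but the stellar temperature and radius ratio are placeholder values, not the paper's fitted parameters.

```python
import math

# Physical constants (SI)
H = 6.62607015e-34   # Planck constant, J s
C = 2.99792458e8     # speed of light, m/s
KB = 1.380649e-23    # Boltzmann constant, J/K

def brightness_temperature(flux_ratio, wavelength, t_star, rp_over_rs):
    """Invert the Planck function for the planet's brightness temperature.

    Fp/Fs = (Rp/Rs)**2 * B(Tp)/B(Ts), so B(Tp) is known and Tp follows
    from solving B(Tp) = x * B(Ts) for Tp.
    """
    a = H * C / (wavelength * KB)        # hc / (lambda * k), in kelvin
    x = flux_ratio / rp_over_rs**2       # B(Tp) / B(Ts)
    return a / math.log(1.0 + (math.exp(a / t_star) - 1.0) / x)

# Illustrative inputs: 0.269% K-band flux ratio (from the abstract);
# 5700 K and Rp/Rs = 0.11 are assumed placeholders for a G-type host.
tb = brightness_temperature(0.00269, 2.2e-6, 5700.0, 0.11)
print(f"{tb:.0f} K")
```

With these placeholder stellar parameters the result lands near the ~2900 K value reported for the K band, as expected for a highly irradiated dayside.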
Distance Guided Channel Weighting for Semantic Segmentation
Recent works have achieved great success in improving the performance of
multiple computer vision tasks by capturing features with a high channel number
utilizing deep neural networks. However, many channels of extracted features
are not discriminative and contain a lot of redundant information. In this
paper, we address above issue by introducing the Distance Guided Channel
Weighting (DGCW) Module. The DGCW module is constructed in a pixel-wise context
extraction manner, which enhances the discriminativeness of features by
weighting different channels of each pixel's feature vector when modeling its
relationship with other pixels. It can make full use of the high-discriminative
information while ignoring the low-discriminative information contained in
feature maps, and can also capture long-range dependencies. Furthermore, by
incorporating the DGCW module with a baseline segmentation network, we propose
the Distance Guided Channel Weighting Network (DGCWNet). We conduct extensive
experiments to demonstrate the effectiveness of DGCWNet. In particular, it
achieves 81.6% mIoU on Cityscapes with only fine annotated data for training,
and also achieves satisfactory performance on two other semantic segmentation
datasets, Pascal Context and ADE20K. Code will be available soon at
https://github.com/LanyunZhu/DGCWNet
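The core idea above is that each pixel's channels are reweighted before its relationships with other pixels are modelled. The snippet below is a generic NumPy sketch of per-pixel channel weighting, not the authors' DGCW module: it simply weights each channel by how far its value lies from that channel's spatial mean, so that discriminative (outlying) channels dominate.

```python
import numpy as np

def channel_weighting(feat, i, j):
    """Reweight the channels of pixel (i, j) of a (C, H, W) feature map.

    Generic illustration only: the weight of each channel is its distance
    from the per-channel spatial mean, normalised to sum to one.
    """
    c, h, w = feat.shape
    pixel = feat[:, i, j]                       # (C,) feature vector
    mean = feat.reshape(c, -1).mean(axis=1)     # per-channel mean over pixels
    dist = np.abs(pixel - mean)                 # channel-wise distance
    weights = dist / (dist.sum() + 1e-8)        # normalise to a distribution
    return weights * pixel                      # low-distance channels damped

rng = np.random.default_rng(0)
feat = rng.normal(size=(8, 4, 4))
out = channel_weighting(feat, 1, 2)
print(out.shape)  # (8,)
```

In a real segmentation network this weighting would be computed by learned layers and applied across all pixels at once; the sketch only shows the per-pixel contract.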
Text Augmentation: Inserting markup into natural language text with PPM Models
This thesis describes a new optimisation and new heuristics for automatically marking up XML documents. These are implemented in CEM, using PPM models. CEM is significantly more general than previous systems, marking up large numbers of hierarchical tags, using n-gram models for large n and a variety of escape methods.
Four corpora are discussed, including the bibliography corpus of 14,682 bibliographies laid out in seven standard styles using the BibTeX system and marked up in XML with every field from the original BibTeX. Other corpora include the ROCLING Chinese text segmentation corpus, the Computists’ Communique corpus and the Reuters’ corpus. A detailed examination is presented of the methods of evaluating markup algorithms, including computational complexity measures and correctness measures from the fields of information retrieval, string processing, machine learning and information theory.
A new taxonomy of markup complexities is established and the properties of each taxon are examined in relation to the complexity of marked-up documents. The performance of the new heuristics and optimisation is examined using the four corpora.
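The central idea of PPM-based markup is that each tag has its own compression model, and a span of text receives the tag whose model encodes it most cheaply. The toy sketch below substitutes a simple order-1 character model with add-one smoothing for a real PPM model; the tags and training strings are invented for illustration.

```python
import math
from collections import Counter, defaultdict

# Toy training data: a few strings per tag (invented, not from the thesis).
training = {
    "year":   ["1994", "2003", "1987"],
    "author": ["Knuth", "Turing", "Hamming"],
}

# One order-1 character model per tag: counts[prev][cur].
models = {}
for tag, samples in training.items():
    counts = defaultdict(Counter)
    for s in samples:
        for prev, cur in zip(" " + s, s):
            counts[prev][cur] += 1
    models[tag] = counts

def code_length(tag, span, alphabet=96):
    """Bits needed to encode `span` under the tag's order-1 model."""
    counts = models[tag]
    bits = 0.0
    for prev, cur in zip(" " + span, span):
        total = sum(counts[prev].values())
        p = (counts[prev][cur] + 1) / (total + alphabet)  # add-one smoothing
        bits -= math.log2(p)
    return bits

def mark_up(span):
    """Assign the tag whose model compresses the span best."""
    return min(models, key=lambda tag: code_length(tag, span))

print(mark_up("2010"))  # digit sequences encode cheapest under "year"
```

PPM improves on this sketch by blending contexts of several orders with escape probabilities, which is what lets CEM handle large n and hierarchical tags.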
A Compression-Based Toolkit for Modelling and Processing Natural Language Text
A novel compression-based toolkit for modelling and processing natural language text is described. The design of the toolkit adopts an encoding perspective—applications are considered to be problems in searching for the best encoding of different transformations of the source text into the target text. This paper describes a two-phase ‘noiseless channel model’ architecture that underpins the toolkit, which models text processing as lossless communication down a noise-free channel. The transformation and encoding that are performed in the first phase must be both lossless and reversible. The role of the verification and decoding second phase is to verify the correctness of the communication of the target text that is produced by the application. This paper argues that this encoding approach has several advantages over the decoding approach of the standard noisy channel model. The concepts abstracted by the toolkit’s design are explained together with details of the library calls. The pseudo-code for a number of algorithms is also described for the applications that the toolkit implements, including encoding, decoding, classification, training (model building), parallel sentence alignment, word segmentation and language segmentation. Some experimental results, implementation details, memory usage and execution speeds are also discussed for these applications.
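Among the listed applications, word segmentation shows the "search for the best encoding" idea most directly: the chosen segmentation is the one whose words are cheapest to encode. The sketch below uses dynamic programming over an assumed unigram cost model; the real toolkit scores candidate segmentations with compression models rather than a fixed lexicon.

```python
import math

# Assumed unigram probabilities (illustrative, not from the toolkit).
lexicon = {"the": 0.04, "them": 0.002, "me": 0.01, "then": 0.003}

def word_cost(word):
    """Code length of a word in bits: -log2 p, or a steep penalty if unknown."""
    p = lexicon.get(word)
    return -math.log2(p) if p else 40.0 + 8.0 * len(word)

def segment(text):
    """Find the minimum-total-code-length segmentation by dynamic programming."""
    n = len(text)
    best = [(0.0, 0)] + [(math.inf, 0)] * n   # best[j] = (cost, split point)
    for j in range(1, n + 1):
        for i in range(max(0, j - 12), j):    # cap candidate word length at 12
            cost = best[i][0] + word_cost(text[i:j])
            if cost < best[j][0]:
                best[j] = (cost, i)
    words, j = [], n                          # walk the split points backwards
    while j > 0:
        i = best[j][1]
        words.append(text[i:j])
        j = i
    return words[::-1]

print(segment("theme"))  # → ['the', 'me'], cheaper than 'them' + unknown 'e'
```

Swapping `word_cost` for a PPM model's code length turns the same search into the compression-based segmenter the paper describes.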