Neural Architecture for Online Ensemble Continual Learning
Continual learning with an increasing number of classes is a challenging
task. The difficulty increases when each example is presented exactly once,
which requires the model to learn online. Recent methods based on classic
parameter optimization procedures have been shown to struggle in such setups
or to have limitations such as non-differentiable components or memory
buffers. For this reason, we present a fully differentiable ensemble method
that allows us to efficiently train an ensemble of neural networks in an
end-to-end regime. The proposed technique achieves SOTA results without a
memory buffer and clearly outperforms the reference methods. The conducted
experiments have also shown a significant increase in performance for small
ensembles, which demonstrates the capability of obtaining relatively high
classification accuracy with a reduced number of classifiers.
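As a minimal illustration of the general idea (a sketch, not the paper's actual architecture), the toy ensemble below averages member logits so that a single cross-entropy gradient updates every member end-to-end, one example at a time; the class name, hyperparameters, and synthetic data are all invented for this example:

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class OnlineEnsemble:
    """Toy fully differentiable ensemble: member logits are averaged, so the
    cross-entropy gradient flows through every member in one end-to-end step."""

    def __init__(self, n_members, n_features, n_classes, lr=0.5):
        self.W = rng.normal(0.0, 0.01, (n_members, n_features, n_classes))
        self.lr = lr

    def predict(self, x):
        logits = np.einsum("mfc,f->mc", self.W, x).mean(axis=0)
        return int(np.argmax(logits))

    def online_step(self, x, y):
        """Single SGD step on one example (each example is seen exactly once)."""
        logits = np.einsum("mfc,f->mc", self.W, x)      # (members, classes)
        p = softmax(logits.mean(axis=0))                # ensemble prediction
        g = p.copy()
        g[y] -= 1.0                                     # dCE / d(mean logit)
        g /= len(self.W)                                # share across members
        self.W -= self.lr * np.outer(x, g)[None, :, :]  # same update per member

# Invented streaming data: two well-separated Gaussian classes, one pass only.
X = np.vstack([rng.normal([-2.0, 0.0], 0.5, (200, 2)),
               rng.normal([2.0, 0.0], 0.5, (200, 2))])
y = np.array([0] * 200 + [1] * 200)
ens = OnlineEnsemble(n_members=3, n_features=2, n_classes=2)
for i in rng.permutation(400):
    ens.online_step(X[i], y[i])
accuracy = np.mean([ens.predict(X[i]) == y[i] for i in range(400)])
```

Averaging logits (rather than keeping members independent) is what makes the ensemble a single differentiable model here; it is one plausible choice, not necessarily the paper's.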
Similarity-based Memory Enhanced Joint Entity and Relation Extraction
Document-level joint entity and relation extraction is a challenging
information extraction problem that requires a unified approach where a single
neural network performs four sub-tasks: mention detection, coreference
resolution, entity classification, and relation extraction. Existing methods
often utilize a sequential multi-task learning approach, in which an arbitrary
decomposition causes the current task to depend only on the previous one,
overlooking possible more complex relationships between them.
In this paper, we present a multi-task learning framework with bidirectional
memory-like dependency between tasks to address those drawbacks and solve the
joint problem more accurately. Our empirical studies show that the proposed
approach outperforms the existing methods and achieves state-of-the-art results
on the BioCreative V CDR corpus.
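To make the contrast with a one-way pipeline concrete, here is a heavily simplified, hypothetical sketch (not the paper's model) in which two task heads each read the other's latest "memory" state across refinement passes, so the dependency runs in both directions rather than only from the previous task to the current one; all dimensions and weights are invented:

```python
import numpy as np

rng = np.random.default_rng(1)
D, H = 8, 4  # invented sizes: encoder output dim, task memory dim

# Shared encoder plus two toy task heads (think entity vs. relation scoring).
W_enc = rng.normal(0.0, 0.1, (D, D))
W_a = rng.normal(0.0, 0.1, (D + H, H))  # head A also reads head B's memory
W_b = rng.normal(0.0, 0.1, (D + H, H))  # head B also reads head A's memory

def forward(x, passes=2):
    """Each pass lets every head read the other's latest state, so the
    dependency is bidirectional instead of a fixed pipeline order."""
    h = np.tanh(x @ W_enc)
    mem_a, mem_b = np.zeros(H), np.zeros(H)
    for _ in range(passes):
        mem_a = np.tanh(np.concatenate([h, mem_b]) @ W_a)
        mem_b = np.tanh(np.concatenate([h, mem_a]) @ W_b)
    return mem_a, mem_b
```

With a single pass this degenerates to the sequential case (head A never sees head B); a second pass is what lets information flow back.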
Classical Out-of-Distribution Detection Methods Benchmark in Text Classification Tasks
State-of-the-art models can perform well in controlled environments, but they
often struggle when presented with out-of-distribution (OOD) examples, making
OOD detection a critical component of NLP systems. In this paper, we focus on
highlighting the limitations of existing approaches to OOD detection in NLP.
Specifically, we evaluated eight OOD detection methods that are easily
integrable into existing NLP systems and require no additional OOD data or
model modifications. One of our contributions is providing a well-structured
research environment that allows for full reproducibility of the results.
Additionally, our analysis shows that existing OOD detection methods for NLP
tasks are not yet sufficiently sensitive to capture all samples characterized
by various types of distributional shifts. Particularly challenging testing
scenarios arise in cases of background shift and randomly shuffled word order
within in-domain texts. This highlights the need for future work to develop
more effective OOD detection approaches for NLP problems, and our work
provides a well-defined foundation for further research in this area.
Comment: 11 pages, 3 figures, Association for Computational Linguistics
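One classic method in this family, requiring no additional OOD data or model modifications, is the maximum softmax probability (MSP) baseline; the abstract does not list the eight evaluated methods, so MSP is only an assumed representative, and the threshold below is invented:

```python
import numpy as np

def softmax(z):
    z = np.asarray(z, dtype=float)
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def msp_score(logits):
    """Maximum softmax probability: low top-class confidence suggests OOD."""
    return softmax(logits).max(axis=-1)

def is_ood(logits, threshold=0.7):
    """Flag an input whose top-class probability falls below the threshold."""
    return msp_score(logits) < threshold
```

Methods like this operate purely on the classifier's output distribution, which is what makes them easy to bolt onto an existing NLP system, and also why they can miss shifts (such as background shift) that barely move the predicted class probabilities.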
Domain-Agnostic Neural Architecture for Class Incremental Continual Learning in Document Processing Platform
Production deployments in complex systems require ML architectures to be
highly efficient and usable against multiple tasks. Particularly demanding are
classification problems in which data arrives in a streaming fashion and each
class is presented separately. Recent methods based on stochastic gradient
learning have been shown to struggle in such setups or to have limitations
such as memory buffers or restriction to specific domains, which prevents
their use in real-world scenarios. For this reason, we present a fully
differentiable architecture based on the Mixture of Experts model that
enables the training of high-performance classifiers when examples from each
class are presented separately. We conducted exhaustive experiments that
demonstrated its applicability in various domains and its ability to learn
online in production environments. The proposed technique achieves SOTA
results without a memory buffer and clearly outperforms the reference methods.
Comment: arXiv admin note: text overlap with arXiv:2211.1496
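As a rough illustration of the underlying building block (a sketch with invented toy dimensions, not the paper's production architecture), a Mixture of Experts forward pass combines per-expert predictions through differentiable gating weights, which is what keeps the whole model trainable end-to-end:

```python
import numpy as np

rng = np.random.default_rng(2)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

class MixtureOfExperts:
    """Toy MoE forward pass: a gating network produces differentiable weights
    over linear experts, so the mixture can be trained as a single model."""

    def __init__(self, n_experts, n_features, n_classes):
        self.gate = rng.normal(0.0, 0.1, (n_features, n_experts))
        self.experts = rng.normal(0.0, 0.1, (n_experts, n_features, n_classes))

    def forward(self, x):
        weights = softmax(x @ self.gate)                  # (experts,) gate
        logits = np.einsum("efc,f->ec", self.experts, x)  # per-expert logits
        return softmax(weights @ logits)                  # mixed distribution
```

Because the gate is a softmax rather than a hard argmax, gradients reach every expert on every example; a hard routing choice would reintroduce the non-differentiable components the abstract argues against.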
Computer vision-based automated peak picking applied to protein NMR spectra
Motivation: A detailed analysis of multidimensional NMR spectra of macromolecules requires the identification of individual resonances (peaks). This task can be tedious and time-consuming and often requires support by experienced users. Automated peak picking algorithms were introduced more than 25 years ago, but there are still major deficiencies that often prevent complete and error-free peak picking of biological macromolecule spectra. The major challenges for automated peak picking algorithms are the distinction of artifacts from real peaks, particularly those with irregular shapes, and the picking of peaks in spectral regions with overlapping resonances, which are very hard to resolve by existing computer algorithms. In both of these cases a visual inspection approach could be more effective than a 'blind' algorithm.
Results: We present a novel approach using computer vision (CV) methodology which could be better adapted to the problem of peak recognition. After suitable 'training' we successfully applied the CV algorithm to spectra of medium-sized soluble proteins up to molecular weights of 26 kDa and to a 130 kDa complex of a tetrameric membrane protein in detergent micelles. Our CV approach outperforms commonly used programs. With suitable training datasets the application of the presented method can be extended to automated peak picking in multidimensional spectra of nucleic acids or carbohydrates and adapted to solid-state NMR spectra.
Availability and implementation: CV-Peak Picker is available upon request from the authors.
Contact: [email protected]; [email protected]; [email protected]
Supplementary information: Supplementary data are available at Bioinformatics online.
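For contrast with the CV approach, a naive 'blind' peak picker of the kind the abstract argues against can be sketched as a strict 3x3 local-maximum search over a 2D spectrum; the synthetic spectrum and threshold below are invented for the example, and real spectra would add the artifacts and overlaps such a baseline mishandles:

```python
import numpy as np

def pick_peaks(spectrum, threshold):
    """Naive baseline: a grid point is a peak if it exceeds `threshold`
    and is the strict maximum of its 3x3 neighbourhood."""
    peaks = []
    rows, cols = spectrum.shape
    for i in range(1, rows - 1):
        for j in range(1, cols - 1):
            patch = spectrum[i - 1:i + 2, j - 1:j + 2]
            centre = spectrum[i, j]
            if (centre >= threshold and centre == patch.max()
                    and np.count_nonzero(patch == centre) == 1):
                peaks.append((i, j))
    return peaks

# Invented test spectrum: two clean Gaussian resonances, no noise or overlap.
ii, jj = np.mgrid[0:20, 0:20]
spectrum = np.zeros((20, 20))
for ci, cj in [(5, 5), (14, 12)]:
    spectrum += np.exp(-((ii - ci) ** 2 + (jj - cj) ** 2) / 2.0)
found = pick_peaks(spectrum, threshold=0.5)
```

On this idealized input the baseline recovers both resonances; it is precisely on irregular peak shapes and overlapping resonances, as the abstract notes, that such threshold-plus-local-maximum rules break down.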
Motivation: A detailed analysis of multidimensional NMR spectra of macromolecules requires the identification of individual resonances (peaks). This task can be tedious and time-consuming and often requires support by experienced users. Automated peak picking algorithms were introduced more than 25 years ago, but there are still major deficiencies/flaws that often prevent complete and error free peak picking of biological macromolecule spectra. The major challenges of automated peak picking algorithms is both the distinction of artifacts from real peaks particularly from those with irregular shapes and also picking peaks in spectral regions with overlapping resonances which are very hard to resolve by existing computer algorithms. In both of these cases a visual inspection approach could be more effective than a 鈥榖lind' algorithm. Results: We present a novel approach using computer vision (CV) methodology which could be better adapted to the problem of peak recognition. After suitable 鈥榯raining' we successfully applied the CV algorithm to spectra of medium-sized soluble proteins up to molecular weights of 26鈥塳Da and to a 130鈥塳Da complex of a tetrameric membrane protein in detergent micelles. Our CV approach outperforms commonly used programs. With suitable training datasets the application of the presented method can be extended to automated peak picking in multidimensional spectra of nucleic acids or carbohydrates and adapted to solid-state NMR spectra. Availability and implementation: CV-Peak Picker is available upon request from the authors. Contact: [email protected]; [email protected]; [email protected] Supplementary information: Supplementary data are available at Bioinformatics onlin