14 research outputs found
Automated Medical Coding on MIMIC-III and MIMIC-IV: A Critical Review and Replicability Study
Medical coding is the task of assigning medical codes to clinical free-text
documentation. Healthcare professionals manually assign such codes to track
patient diagnoses and treatments. Automated medical coding can considerably
alleviate this administrative burden. In this paper, we reproduce, compare, and
analyze state-of-the-art automated medical coding machine learning models. We
show that several models underperform due to weak configurations, poorly
sampled train-test splits, and insufficient evaluation. In previous work, the
macro F1 score has been calculated sub-optimally, and our correction doubles
it. We contribute a revised model comparison using stratified sampling and
identical experimental setups, including hyperparameters and decision boundary
tuning. We analyze prediction errors to validate and falsify assumptions of
previous works. The analysis confirms that all models struggle with rare codes,
while long documents only have a negligible impact. Finally, we present the
first comprehensive results on the newly released MIMIC-IV dataset using the
reproduced models. We release our code, model parameters, and new MIMIC-III and
MIMIC-IV training and evaluation pipelines to accommodate fair future
comparisons.
Comment: 11 pages, 6 figures, to be published in Proceedings of the 46th International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR '23), July 23-27, 2023, Taipei, Taiwan
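A minimal sketch of the macro-F1 point above, assuming the sub-optimal variant averages per-code F1 over the full code vocabulary (so codes that never occur in the test split contribute zeros) while the corrected variant averages only over codes present in the test data; the codes and predictions below are illustrative, not from MIMIC.

```python
import numpy as np
from sklearn.metrics import f1_score

# Illustrative multi-label predictions over a vocabulary of 6 codes,
# only 3 of which actually occur in this (hypothetical) test split.
y_true = np.array([[1, 0, 0, 1, 0, 0],
                   [0, 1, 0, 0, 0, 0],
                   [1, 1, 0, 1, 0, 0]])
y_pred = np.array([[1, 0, 0, 1, 0, 0],
                   [0, 1, 0, 1, 0, 0],
                   [1, 0, 0, 1, 0, 0]])

# Sub-optimal: average F1 over all 6 codes; absent codes contribute 0.
macro_all = f1_score(y_true, y_pred, average="macro", zero_division=0)

# Corrected: average only over codes that appear in the test split.
present = y_true.sum(axis=0) > 0
macro_present = f1_score(y_true[:, present], y_pred[:, present],
                         average="macro", zero_division=0)

print(f"macro F1 over all codes:     {macro_all:.3f}")
print(f"macro F1 over present codes: {macro_present:.3f}")
```

On this toy example the corrected average is exactly double the naive one, mirroring the effect described in the abstract.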
Tsunami and the Construction of the Disabled Southern Body
We investigate (i) whether human annotators can infer ratings from IMDb movie reviews, (ii) how human performance compares to a regression model, and (iii) whether model performance is affected by the rating source (i.e. author vs. annotator ratings). We collect a data set of IMDb movie reviews with author-provided ratings, and have it re-annotated by crowdsourced and expert annotators. Annotators reproduce the original ratings better than a linear regression model, but are off by a large margin in more than 5% of the cases. Models trained on annotator-labeled data outperform those trained on author-labeled data, questioning the usefulness of author-rated reviews as labeled data for sentiment analysis.
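As a rough illustration of the regression baseline mentioned above, here is a minimal bag-of-words ridge regression predicting a numeric rating from review text; the features, hyperparameters, and toy data are assumptions for illustration, not the authors' exact setup.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Toy review/rating pairs standing in for the IMDb data (ratings on a 1-10 scale).
train_reviews = ["A stunning, heartfelt film.",
                 "Dull plot and wooden acting.",
                 "Enjoyable, though the ending drags."]
train_ratings = [9, 3, 7]

# TF-IDF features fed into a ridge (regularized linear) regression.
model = make_pipeline(TfidfVectorizer(ngram_range=(1, 2)), Ridge(alpha=1.0))
model.fit(train_reviews, train_ratings)

print(model.predict(["Heartfelt acting but a dull ending."]))
```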
Benchmarking Generative Latent Variable Models for Speech
Stochastic latent variable models (LVMs) achieve state-of-the-art performance
on natural image generation but are still inferior to deterministic models on
speech. In this paper, we develop a speech benchmark of popular temporal LVMs
and compare them against state-of-the-art deterministic models. We report the likelihood, a metric widely used in the image domain but rarely, or not comparably, reported for speech models. To assess the quality of the learned
representations, we also compare their usefulness for phoneme recognition.
Finally, we adapt the Clockwork VAE, a state-of-the-art temporal LVM for video
generation, to the speech domain. Despite being autoregressive only in latent
space, we find that the Clockwork VAE can outperform previous LVMs and reduce
the gap to deterministic models by using a hierarchy of latent variables.
Comment: Accepted at the 2022 ICLR workshop on Deep Generative Models for Highly Structured Data (https://deep-gen-struct.github.io)
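The reported likelihood for such temporal LVMs is typically a variational lower bound; as general background (not necessarily the exact objective used in the paper), the standard ELBO for a sequence x_{1:T} with latents z_{1:T} is:

```latex
\log p_\theta(x_{1:T}) \;\geq\;
\mathbb{E}_{q_\phi(z_{1:T}\mid x_{1:T})}\!\left[\log p_\theta(x_{1:T}\mid z_{1:T})\right]
\;-\; D_{\mathrm{KL}}\!\left(q_\phi(z_{1:T}\mid x_{1:T})\,\middle\|\,p_\theta(z_{1:T})\right)
```

To make models comparable, this bound is often normalized per frame or per dimension.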
A Brief Overview of Unsupervised Neural Speech Representation Learning
Unsupervised representation learning for speech processing has matured
greatly in the last few years. Work in computer vision and natural language
processing has paved the way, but speech data offers unique challenges. As a
result, methods from other domains rarely translate directly. We review the
development of unsupervised representation learning for speech over the last
decade. We identify two primary model categories: self-supervised methods and
probabilistic latent variable models. We describe the models and develop a
comprehensive taxonomy. Finally, we discuss and compare models from the two
categories.
Comment: The 2nd Workshop on Self-supervised Learning for Audio and Speech Processing (SAS) at AAAI
On scaling contrastive representations for low-resource speech recognition
Recent advances in self-supervised learning through contrastive training have
shown that it is possible to learn a competitive speech recognition system with
as little as 10 minutes of labeled data. However, these systems are
computationally expensive since they require pre-training followed by
fine-tuning in a large parameter space. We explore the performance of such
systems without fine-tuning by training a state-of-the-art speech recognizer on
the fixed representations from the computationally demanding wav2vec 2.0
framework. We find performance to decrease without fine-tuning and, in the
extreme low-resource setting, wav2vec 2.0 is inferior to its predecessor. In
addition, we find that wav2vec 2.0 representations live in a low-dimensional
subspace and that decorrelating the features of the representations can
stabilize training of the automatic speech recognizer. Finally, we propose a
bidirectional extension to the original wav2vec framework that consistently
improves performance.
Comment: © 2021 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.
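The decorrelation idea above can be illustrated with PCA whitening applied to the frozen wav2vec 2.0 features before the speech recognizer consumes them; this is a sketch under that assumption, not the exact procedure from the paper, and the feature array is a random placeholder.

```python
import numpy as np

def whiten(features, eps=1e-5):
    """PCA-whiten frame-level features so their dimensions are decorrelated
    and have approximately unit variance."""
    centered = features - features.mean(axis=0)
    cov = np.cov(centered, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)
    transform = eigvecs / np.sqrt(eigvals + eps)  # scale each eigenvector
    return centered @ transform

# Placeholder for frozen wav2vec 2.0 representations: (num_frames, feature_dim).
frozen_features = np.random.randn(1000, 768)
whitened = whiten(frozen_features)

# The whitened features have (near-)identity covariance.
print(np.allclose(np.cov(whitened, rowvar=False), np.eye(768), atol=1e-2))
```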
Do end-to-end speech recognition models care about context?
The two most common paradigms for end-to-end speech recognition are
connectionist temporal classification (CTC) and attention-based encoder-decoder
(AED) models. It has been argued that the latter is better suited for learning
an implicit language model. We test this hypothesis by measuring temporal
context sensitivity and evaluate how the models perform when we constrain the
amount of contextual information in the audio input. We find that the AED model
is indeed more context sensitive, but that the gap can be closed by adding
self-attention to the CTC model. Furthermore, the two models perform similarly
when contextual information is constrained. Finally, in contrast to previous
research, our results show that the CTC model is highly competitive on WSJ and
LibriSpeech without the help of an external language model.
Comment: Published in the proceedings of INTERSPEECH 2020, pp. 4352-435
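One way to picture the constrained-context experiment is to decode fixed-length chunks of audio independently, so that neither model can use information outside the chunk; the sketch below assumes that reading and uses a placeholder recognizer rather than an actual CTC or AED model.

```python
import numpy as np

def transcribe_with_limited_context(waveform, recognizer, sample_rate=16000,
                                    context_seconds=1.0):
    """Split the input into fixed-length chunks and decode each one
    independently, so the model never sees audio outside the chunk."""
    chunk = int(sample_rate * context_seconds)
    pieces = [waveform[i:i + chunk] for i in range(0, len(waveform), chunk)]
    return " ".join(recognizer(p) for p in pieces if len(p) > 0)

# Placeholder recognizer standing in for a CTC or AED model.
dummy_recognizer = lambda audio: f"<{len(audio)} samples>"
audio = np.zeros(16000 * 3 + 4000)  # 3.25 s of silence
print(transcribe_with_limited_context(audio, dummy_recognizer))
```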