Search CORE

77,347 research outputs found

A Frame-Based System for Automatic Classification of Semi-Structured Data

Author: Casanova Marco Antonio
Nunes Bernardo Pereira
Publication venue: 'Universidade Federal do Rio Grande do Sul'
Publication date: 31/03/2010

The problem of data classification goes back to the definition of taxonomies covering knowledge areas. With the advent of the Web, the amount of available data increased by several orders of magnitude, making manual data classification impossible. This work presents a tool to automatically classify semi-structured data, represented by frames, without any previous knowledge about structured classes. The tool uses a variation of the K-Medoid algorithm and organizes a set of frames into classes, structured as a strict hierarchy.
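As a rough illustration of the clustering step mentioned in this abstract, the sketch below runs a plain alternating k-medoids over frames. It is not the authors' variation and it does not build the strict hierarchy. The `Frame` type (a frame reduced to its set of slot names), the Jaccard distance, and the first-k initialisation are all assumptions made for the example.

```python
from typing import FrozenSet, List, Sequence

# Assumption: a frame is reduced to the set of its slot names.
Frame = FrozenSet[str]


def jaccard_distance(a: Frame, b: Frame) -> float:
    """Return 1 - |a & b| / |a | b|; two empty frames count as identical."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def k_medoids(frames: Sequence[Frame], k: int, max_iter: int = 100) -> List[int]:
    """Return one cluster index per frame using alternating k-medoids."""
    if not 1 <= k <= len(frames):
        raise ValueError("k must be between 1 and the number of frames")
    medoids = list(range(k))  # deterministic start: the first k frames
    labels: List[int] = []
    for _ in range(max_iter):
        # Assignment step: attach every frame to its nearest medoid.
        labels = [
            min(range(k), key=lambda c: jaccard_distance(f, frames[medoids[c]]))
            for f in frames
        ]
        # Update step: in each cluster, pick the member with the lowest
        # total distance to the other members.
        new_medoids = []
        for c in range(k):
            members = [i for i, lab in enumerate(labels) if lab == c]
            if not members:
                new_medoids.append(medoids[c])  # keep the old medoid
                continue
            best = min(
                members,
                key=lambda i: sum(
                    jaccard_distance(frames[i], frames[j]) for j in members
                ),
            )
            new_medoids.append(best)
        if new_medoids == medoids:
            break  # converged: medoids no longer change
        medoids = new_medoids
    return labels
```

With four hypothetical frames, two describing books and two describing products, `k_medoids(frames, 2)` would be expected to put the book-like frames in one class and the product-like frames in the other.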

Em Questao

Archives of the Faculty of Veterinary Medicine UFRGS

Planar Object Tracking in the Wild: A Benchmark

Author: Liang Pengpeng
Liao Chunyuan
Ling Haibin
Lu Hu
Wang Liming
Wu Yifan
Publication date: 22/05/2018

Planar object tracking is an actively studied problem in vision-based robotic applications. While several benchmarks have been constructed for evaluating state-of-the-art algorithms, there is a lack of video sequences captured in the wild rather than in a constrained laboratory environment. In this paper, we present a carefully designed planar object tracking benchmark containing 210 videos of 30 planar objects sampled in the natural environment. In particular, for each object, we shoot seven videos involving various challenging factors, namely scale change, rotation, perspective distortion, motion blur, occlusion, out-of-view, and unconstrained. The ground truth is carefully annotated semi-manually to ensure quality. Moreover, eleven state-of-the-art algorithms are evaluated on the benchmark using two evaluation metrics, with detailed analysis provided for the evaluation results. We expect the proposed benchmark to benefit future studies on planar object tracking. Comment: Accepted by ICRA 201

arXiv.org e-Print Archive

Crossref

Information Extraction, Data Integration, and Uncertain Data Management: The State of The Art

Author: Habib Mena B.
Keulen Maurice van
Publication venue: Centre for Telematics and Information Technology, University of Twente
Publication date: 01/01/2011

Information extraction, data integration, and uncertain data management are different areas of research that have received considerable attention in the last two decades. Many researchers have tackled these areas individually. However, information extraction systems should be integrated with data integration methods to make use of the extracted information. Handling uncertainty in the extraction and integration process is important for enhancing the quality of the data in such integrated systems. This article presents the state of the art in these areas of research, shows their common ground, and explains how to integrate information extraction and data integration under the cover of uncertainty management.

Maastricht University Research Portal

University of Twente Research Information

Discriminative Segmental Cascades for Feature-Rich Phone Recognition

Author: Gimpel Kevin
Livescu Karen
Tang Hao
Wang Weiran
Publication date: 03/08/2016

Discriminative segmental models, such as segmental conditional random fields (SCRFs) and segmental structured support vector machines (SSVMs), have had success in speech recognition via both lattice rescoring and first-pass decoding. However, such models suffer from slow decoding, hampering the use of computationally expensive features, such as segment neural networks or other high-order features. A typical solution is to use approximate decoding, either by beam pruning in a single pass or by beam pruning to generate a lattice followed by a second pass. In this work, we study discriminative segmental models trained with a hinge loss (i.e., segmental structured SVMs). We show that beam search is not suitable for learning rescoring models in this approach, though it gives good approximate decoding performance when the model is already well-trained. Instead, we consider an approach inspired by structured prediction cascades, which use max-marginal pruning to generate lattices. We obtain a high-accuracy phonetic recognition system with several expensive feature types: a segment neural network, a second-order language model, and second-order phone boundary features.
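To make the idea of max-marginal pruning concrete, here is a minimal sketch on a toy lattice. The max-marginal of an edge is the score of the best complete path that passes through it; edges whose max-marginal falls below a threshold are dropped. The `Edge` layout, the function name, the topological node numbering, and the threshold rule (a mix of the best and the mean max-marginal controlled by `alpha`) are assumptions for illustration and are not taken from the paper.

```python
from typing import List, Tuple

# Assumption: nodes are numbered in topological order (start < end for
# every edge), node 0 is the start and node n_nodes - 1 is the end.
Edge = Tuple[int, int, float]  # (start node, end node, score)


def max_marginal_prune(edges: List[Edge], n_nodes: int, alpha: float) -> List[Edge]:
    """Keep edges whose max-marginal is at least alpha*best + (1-alpha)*mean."""
    neg_inf = float("-inf")

    # forward[v]: best score of any path from node 0 to v.
    forward = [neg_inf] * n_nodes
    forward[0] = 0.0
    for u, v, w in sorted(edges, key=lambda e: e[0]):
        if forward[u] > neg_inf:
            forward[v] = max(forward[v], forward[u] + w)

    # backward[v]: best score of any path from v to the final node.
    backward = [neg_inf] * n_nodes
    backward[n_nodes - 1] = 0.0
    for u, v, w in sorted(edges, key=lambda e: e[1], reverse=True):
        if backward[v] > neg_inf:
            backward[u] = max(backward[u], backward[v] + w)

    # Max-marginal of an edge: best full path forced through that edge.
    scored = [(forward[u] + w + backward[v], (u, v, w)) for u, v, w in edges]
    finite = [m for m, _ in scored if m > neg_inf]
    if not finite:
        return []
    threshold = alpha * max(finite) + (1.0 - alpha) * sum(finite) / len(finite)
    return [edge for m, edge in scored if m >= threshold]
```

Unlike beam pruning, which discards hypotheses based on partial scores seen so far, this filter looks at complete-path scores, so an edge on the best path is never removed.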

arXiv.org e-Print Archive

CiteSeerX

Multimodal Machine Learning for Automated ICD Coding

Author: Band Charlotte
Gao Xin
Lam Mike
MD Ashish K. Khanna
MD Frank Papay
MD Jacek B. Cywinski
MD Kamal Maheshwari
MD Piyush Mathur
Pang Jingzhi
Xie Pengtao
Xing Eric
Xu Keyang
Publication date: 06/08/2019

This study presents a multimodal machine learning model to predict ICD-10 diagnostic codes. We developed separate machine learning models that can handle data from different modalities, including unstructured text, semi-structured text and structured tabular data. We further employed an ensemble method to integrate all modality-specific models to generate ICD-10 codes. Key evidence was also extracted to make our prediction more convincing and explainable. We used the Medical Information Mart for Intensive Care III (MIMIC-III) dataset to validate our approach. For ICD code prediction, our best-performing model (micro-F1 = 0.7633, micro-AUC = 0.9541) significantly outperforms other baseline models including TF-IDF (micro-F1 = 0.6721, micro-AUC = 0.7879) and Text-CNN model (micro-F1 = 0.6569, micro-AUC = 0.9235). For interpretability, our approach achieves a Jaccard Similarity Coefficient (JSC) of 0.1806 on text data and 0.3105 on tabular data, where well-trained physicians achieve 0.2780 and 0.5002 respectively. Comment: Machine Learning for Healthcare 201
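For readers unfamiliar with the two set-based metrics quoted in this abstract, the following sketch shows how micro-averaged F1 and the Jaccard similarity coefficient are commonly computed for multi-label predictions. The function names and the representation of each record as a set of code strings are assumptions for the example; the paper's own evaluation code may differ.

```python
from typing import List, Set


def micro_f1(gold: List[Set[str]], pred: List[Set[str]]) -> float:
    """Micro-averaged F1: pool true/false positives over all records first."""
    tp = sum(len(g & p) for g, p in zip(gold, pred))  # codes correctly predicted
    fp = sum(len(p - g) for g, p in zip(gold, pred))  # predicted but not in gold
    fn = sum(len(g - p) for g, p in zip(gold, pred))  # in gold but missed
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def jaccard_similarity(a: Set[str], b: Set[str]) -> float:
    """|a & b| / |a | b|; two empty sets are treated as fully similar."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0
```

Because micro-averaging pools counts across all records and labels, frequent codes weigh more heavily in the score than rare ones.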

arXiv.org e-Print Archive