Hierarchical Label-wise Attention Transformer Model for Explainable ICD Coding
International Classification of Diseases (ICD) coding plays an important role
in systematically classifying morbidity and mortality data. In this study, we
propose a hierarchical label-wise attention Transformer model (HiLAT) for the
explainable prediction of ICD codes from clinical documents. HiLAT firstly
fine-tunes a pretrained Transformer model to represent the tokens of clinical
documents. We subsequently employ a two-level hierarchical label-wise attention
mechanism that creates label-specific document representations. These
representations are in turn used by a feed-forward neural network to predict
whether a specific ICD code is assigned to the input clinical document of
interest. We evaluate HiLAT using hospital discharge summaries and their
corresponding ICD-9 codes from the MIMIC-III database. To investigate the
performance of different types of Transformer models, we develop
ClinicalplusXLNet, which conducts continual pretraining from XLNet-Base using
all the MIMIC-III clinical notes. The experimental results show that the F1
scores of HiLAT+ClinicalplusXLNet surpass those of the previous
state-of-the-art models on the top-50 most frequent ICD-9 codes from MIMIC-III.
Visualisations of the attention weights provide a potential explainability tool
for checking the face validity of ICD code predictions.
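The core of the approach described above is a label-wise attention layer: each ICD code attends over the token representations to build its own document vector, which is then scored independently. The following is a minimal single-level sketch of that idea (HiLAT's actual mechanism is two-level and operates on Transformer outputs; the array shapes and names here are illustrative, not the paper's):

```python
import numpy as np

def labelwise_attention(H, U, W, b):
    """Label-wise attention sketch: each label attends over tokens to form a
    label-specific document representation, then scores it with a sigmoid.
    H: (n_tokens, d) token representations from the encoder
    U: (n_labels, d) per-label attention query vectors
    W: (n_labels, d) per-label output weights; b: (n_labels,) biases
    """
    scores = U @ H.T                                   # (n_labels, n_tokens)
    A = np.exp(scores - scores.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)                  # softmax over tokens, per label
    V = A @ H                                          # (n_labels, d) label-specific docs
    logits = (V * W).sum(axis=1) + b                   # one score per label
    return 1.0 / (1.0 + np.exp(-logits))               # independent sigmoid per code
```

Because each code gets its own attention distribution `A`, those weights can be visualised per label, which is what makes the predictions inspectable.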
Face Centered Image Analysis Using Saliency and Deep Learning Based Techniques
Image analysis begins with the goal of building vision machines that can perceive like humans, intelligently inferring general principles and sensing the surrounding situation from imagery. This dissertation studies face-centered image analysis as a core problem in high-level computer vision research and addresses it by tackling three challenging subjects: Is there anything interesting in the image? If so, what is it? If a person is present, who is he/she, what expression is he/she performing, and can we estimate his/her age? Answering these questions leads to saliency-based object detection, deep-learning-based object categorization and recognition, human facial landmark detection, and multi-task biometrics.
To implement object detection, a three-level saliency detection method based on the self-similarity technique (SMAP) is first proposed. The first level of SMAP uses statistical methods to generate proto-background patches; the second level computes local contrast based on the image's self-similarity characteristics. Finally, a spatial color distribution constraint is applied to complete the saliency detection. The output of the algorithm is a full-resolution image with highlighted salient objects and well-defined edges.
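The local-contrast level described above can be sketched as comparing each image patch against the proto-background patches: patches far from all background exemplars are more salient. This is a simplified illustration of the idea, not SMAP's exact contrast measure, and all names are hypothetical:

```python
import numpy as np

def local_contrast(patches, bg_patches):
    """Saliency sketch: each patch's score is its minimum Euclidean distance
    to the set of proto-background patches. Background-like patches score
    near zero; dissimilar (salient) patches score high.
    patches:    (n, d) flattened image patch vectors
    bg_patches: (k, d) flattened proto-background patch vectors
    """
    # Pairwise distances between every patch and every background patch.
    d = np.linalg.norm(patches[:, None, :] - bg_patches[None, :, :], axis=2)
    return d.min(axis=1)   # distance to the nearest background exemplar
```

In a full pipeline, these per-patch scores would be reassembled into a saliency map and further constrained by spatial color distribution, as the abstract outlines.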
In object recognition, an Adaptive Deconvolution Network (ADN) is implemented to categorize the objects extracted by saliency detection. To improve system performance, an L1/2-norm-regularized ADN is proposed and tested in different applications. The results demonstrate the efficiency and significance of the new structure.
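The L1/2-norm regulariser mentioned above penalises the square roots of the weight magnitudes, which induces sparser solutions than the usual L1 penalty. A minimal sketch of the penalty term (the function names and the way it is combined with the data loss are illustrative, not the dissertation's formulation):

```python
import numpy as np

def l_half_penalty(W, lam):
    """L1/2-norm regulariser: lam * sum_i |w_i|^(1/2).
    Promotes sparser weights than the L1 norm."""
    return lam * np.sum(np.sqrt(np.abs(W)))

def regularized_objective(data_loss, W, lam):
    # Total training objective: data-fit term plus the sparsity penalty.
    return data_loss + l_half_penalty(W, lam)
```

For example, with weights `[4, 1]` and `lam = 0.5` the penalty is `0.5 * (2 + 1) = 1.5`. Note that the penalty is non-convex and non-smooth at zero, so in practice it is optimised with specialised thresholding schemes rather than plain gradient descent.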
To fully understand the facial-biometrics-related activity in an image, low-rank matrix decomposition is introduced to help locate landmark points on face images. A natural extension of this work benefits research on human facial expression recognition and facial feature parsing.
To facilitate understanding of a detected facial image, automatic facial image analysis becomes essential. We present a novel deeply learnt tree-structured face representation that uniformly models the human face with different semantic meanings. We show that the proposed feature yields a unified representation for multi-task facial biometrics and that the multi-task learning framework is applicable to many other computer vision tasks.