Multimodal Machine Learning for Automated ICD Coding
This study presents a multimodal machine learning model to predict ICD-10
diagnostic codes. We developed separate machine learning models that can handle
data from different modalities, including unstructured text, semi-structured
text and structured tabular data. We further employed an ensemble method to
integrate all modality-specific models to generate ICD-10 codes. Key evidence
was also extracted to make our prediction more convincing and explainable. We
used the Medical Information Mart for Intensive Care III (MIMIC-III) dataset
to validate our approach. For ICD code prediction, our best-performing model
(micro-F1 = 0.7633, micro-AUC = 0.9541) significantly outperforms other
baseline models including TF-IDF (micro-F1 = 0.6721, micro-AUC = 0.7879) and
a Text-CNN model (micro-F1 = 0.6569, micro-AUC = 0.9235). For interpretability,
our approach achieves a Jaccard Similarity Coefficient (JSC) of 0.1806 on text
data and 0.3105 on tabular data, where well-trained physicians achieve 0.2780
and 0.5002 respectively. Comment: Machine Learning for Healthcare 201
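The two metrics quoted in this abstract can be illustrated with a minimal sketch. It assumes each record's ICD codes and each evidence selection are represented as Python sets; the function names and the toy codes and terms are illustrative and are not taken from the paper.

```python
def micro_f1(y_true, y_pred):
    """Micro-averaged F1: pool true/false positives over all records and labels."""
    tp = fp = fn = 0
    for true_codes, pred_codes in zip(y_true, y_pred):
        tp += len(true_codes & pred_codes)   # codes predicted and correct
        fp += len(pred_codes - true_codes)   # codes predicted but wrong
        fn += len(true_codes - pred_codes)   # codes missed
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def jaccard(a, b):
    """Jaccard Similarity Coefficient: |A ∩ B| / |A ∪ B|."""
    a, b = set(a), set(b)
    if not a and not b:
        return 1.0  # convention chosen here for two empty sets
    return len(a & b) / len(a | b)


# Toy data (hypothetical): two records with gold and predicted ICD-10 codes.
gold = [{"I10", "E11"}, {"J18"}]
pred = [{"I10"}, {"J18", "N17"}]
score = micro_f1(gold, pred)

# Toy evidence sets (hypothetical): terms highlighted by a model and by a physician.
overlap = jaccard({"fever", "cough", "hypoxia"}, {"cough", "hypoxia", "sepsis"})
```

Micro-averaging pools counts across all codes, so frequent codes dominate the score; that is one reason micro-F1 and micro-AUC are usually reported together with per-code analyses.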
Program on Earth Observation Data Management Systems (EODMS), appendixes
The needs of state, regional, and local agencies involved in natural resources management in Illinois, Iowa, Minnesota, Missouri, and Wisconsin are investigated to determine the design of information products derivable from satellite remotely sensed data. It is concluded that an operational Earth Observation Data Management System (EODMS) will be most beneficial if it provides a full range of services, from raw data acquisition to interpretation and dissemination of final information products. Included is a cost and performance analysis of alternative processing centers, and an assessment of the impacts of policy, regulation, and government structure on implementing large scale use of remote sensing technology in this community of users.
Lightweight Multilingual Software Analysis
Developer preferences, language capabilities and the persistence of older
languages contribute to the trend that large software codebases are often
multilingual, that is, written in more than one computer language. While
developers can leverage monolingual software development tools to build
software components, companies are faced with the problem of managing the
resultant large, multilingual codebases to address issues with security,
efficiency, and quality metrics. The key challenge is to address the opaque
nature of the language interoperability interface: one language calling
procedures in a second (which may call a third, or even back to the first),
resulting in a potentially tangled, inefficient and insecure codebase. An
architecture is proposed for lightweight static analysis of large multilingual
codebases: the MLSA architecture. Its modular and table-oriented structure
addresses the open-ended nature of multiple languages and language
interoperability APIs. As an application, we focus here on the construction of
call-graphs that capture both inter-language and intra-language calls. The
algorithms for extracting multilingual call-graphs from codebases are
presented, and several examples of multilingual software engineering analysis
are discussed. The state of the implementation and testing of MLSA is
presented, and the implications for future work are discussed. Comment: 15 page
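The idea of a table-oriented, multilingual call-graph can be sketched as follows. This is not the MLSA implementation: the table layout (lists of caller/callee pairs), the `file:function` naming scheme and the example functions are all assumptions made for illustration.

```python
# Hypothetical per-language call tables, as a per-language analyzer might emit them.
# Each row is (caller, callee); names use an assumed "file:function" scheme.
python_calls = [
    ("main.py:run", "main.py:load"),
    ("main.py:run", "libfast.c:fast_sum"),   # Python calling into C
]
c_calls = [
    ("libfast.c:fast_sum", "libfast.c:add"),
    ("libfast.c:fast_sum", "main.py:log_cb"),  # C calling back into Python
]


def language_of(node):
    """Infer the language from the file extension (assumed naming scheme)."""
    filename = node.split(":", 1)[0]
    ext = filename.rsplit(".", 1)[-1]
    return {"py": "Python", "c": "C"}.get(ext, "unknown")


def build_call_graph(*tables):
    """Merge per-language call tables into one adjacency map."""
    graph = {}
    for table in tables:
        for caller, callee in table:
            graph.setdefault(caller, set()).add(callee)
            graph.setdefault(callee, set())  # make sure leaf nodes are present
    return graph


def inter_language_edges(graph):
    """Edges whose endpoints are written in different languages."""
    return sorted(
        (caller, callee)
        for caller, callees in graph.items()
        for callee in callees
        if language_of(caller) != language_of(callee)
    )


def reachable(graph, start):
    """All procedures reachable from start, following calls across languages."""
    seen, stack = set(), [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph.get(node, ()))
    return seen


graph = build_call_graph(python_calls, c_calls)
cross = inter_language_edges(graph)
closure = reachable(graph, "main.py:run")
```

Keeping each language's output as a plain table means a new language only needs a new table producer; the merge and traversal steps stay unchanged.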
Local Rule-Based Explanations of Black Box Decision Systems
Recent years have witnessed the rise of accurate but obscure decision
systems which hide the logic of their internal decision processes from users.
The lack of explanations for the decisions of black box systems is a key
ethical issue, and a limitation to the adoption of machine learning components
in socially sensitive and safety-critical contexts. Therefore, we need
explanations that reveal the reasons why a predictor takes a certain decision.
In this paper we focus on the problem of black box outcome explanation, i.e.,
explaining the reasons for the decision taken on a specific instance. We propose
LORE, an agnostic method able to provide interpretable and faithful
explanations. LORE first learns a local interpretable predictor on a synthetic
neighborhood generated by a genetic algorithm. Then it derives from the logic
of the local interpretable predictor a meaningful explanation consisting of: a
decision rule, which explains the reasons for the decision; and a set of
counterfactual rules, suggesting the changes in the instance's features that
lead to a different outcome. Extensive experiments show that LORE outperforms
existing methods and baselines both in the quality of explanations and in the
accuracy in mimicking the black box.
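The general shape of a local rule-based explanation can be shown with a deliberately simplified sketch. It is not LORE itself: a fixed grid of perturbed points stands in for the genetic neighborhood generation, and a single-split decision stump stands in for the local interpretable predictor. The black box, the feature names and the step sizes are hypothetical.

```python
from itertools import product


def black_box(x):
    """Stand-in opaque model (hypothetical): class 1 when income >= 50."""
    return 1 if x["income"] >= 50 else 0


def neighborhood(instance, steps):
    """Grid of points around the instance (simplified stand-in for a
    genetically generated synthetic neighborhood)."""
    names = sorted(instance)
    axes = [[instance[n] + d for d in steps[n]] for n in names]
    return [dict(zip(names, values)) for values in product(*axes)]


def majority(labels):
    return max(set(labels), key=labels.count)


def fit_stump(points, labels):
    """Find the single split (feature, threshold, left_label, right_label)
    that best reproduces the black-box labels on the neighborhood."""
    best = None
    for name in sorted(points[0]):
        values = sorted({p[name] for p in points})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for p, y in zip(points, labels) if p[name] <= t]
            right = [y for p, y in zip(points, labels) if p[name] > t]
            l_lab, r_lab = majority(left), majority(right)
            correct = left.count(l_lab) + right.count(r_lab)
            if best is None or correct > best[0]:
                best = (correct, name, t, l_lab, r_lab)
    return best[1:]


def explain(instance, stump):
    """Return (decision rule, counterfactual rule) for the instance."""
    name, t, l_lab, r_lab = stump
    if instance[name] <= t:
        return (f"{name} <= {t} -> class {l_lab}",
                f"{name} > {t} -> class {r_lab}")
    return (f"{name} > {t} -> class {r_lab}",
            f"{name} <= {t} -> class {l_lab}")


instance = {"income": 40, "age": 30}
steps = {"income": range(-20, 21, 2), "age": (-10, -5, 0, 5, 10)}
points = neighborhood(instance, steps)
labels = [black_box(p) for p in points]
stump = fit_stump(points, labels)
rule, counterfactual = explain(instance, stump)

# Fidelity: how often the local surrogate agrees with the black box nearby.
name, t, l_lab, r_lab = stump
fidelity = sum(
    (l_lab if p[name] <= t else r_lab) == y for p, y in zip(points, labels)
) / len(points)
```

The decision rule describes the branch the instance falls into, and the counterfactual rule describes the sibling branch with a different outcome; a tree with several splits would yield richer rules of both kinds.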
A review of GIS-based information sharing systems
GIS-based information sharing systems have been implemented in many of England and Wales' Crime and Disorder Reduction Partnerships (CDRPs). The information sharing role of these systems is seen as being vital to help in the review of crime, disorder and misuse of drugs; to sustain strategic objectives; to monitor interventions and initiatives; and to support action plans for service delivery. This evaluation of these systems aimed to identify the lessons learned from existing systems, identify how these systems can be best used to support the business functions of CDRPs, identify common weaknesses across the systems, and produce guidelines on how these systems should be further developed. At present there are in excess of 20 major systems distributed across England and Wales. This evaluation considered a representative sample of ten systems. To date, the systems have collected little documented evidence that demonstrates the direct impact they are having in reducing crime and disorder, and the misuse of drugs. All point to how they are contributing to more effective partnership working, but all systems must be encouraged to record how they are contributing to improving community safety. Demonstrating this impact will help them to assure their future role in their CDRPs. By reviewing the systems as a whole, several key ingredients were identified that contribute to the effectiveness of these systems. These included the need for an effective partnership business model within which the system operates, and the generation of good quality multi-agency intelligence products from the system. In helping to determine the future development of GIS-based information sharing systems, four key community safety partnership business service functions have been identified that these systems can most effectively support.
These functions support the performance review requirements of CDRPs, operate a problem solving scanning and analysis role, and offer an interface with the public. Using these business service functions as a template will provide for a more effective application of these systems nationally.