Classification on imbalanced data sets, taking advantage of errors to improve performance
Classification methods usually exhibit poor performance when they are applied to imbalanced data sets. To overcome this problem, several algorithms have been proposed in the last decade. Most of them generate synthetic instances in order to balance data sets, regardless of the classification algorithm. These methods work reasonably well in most cases; however, they tend to cause over-fitting. In this paper, we propose a method to address the imbalance problem. Our approach, which is very simple to implement, works in two phases: the first detects instances that are difficult for classification methods to predict correctly. These instances are then categorized as "noisy" or "secure", where the former refers to instances most of whose nearest neighbors belong to the opposite class. The second phase consists of generating a number of synthetic instances for each of those that are difficult to predict correctly. After applying our method to data sets, the AUC of classifiers improves dramatically. We compare our method with others from the state of the art, using more than 10 data sets.
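The two-phase idea can be sketched roughly as follows. This is an illustrative reconstruction, not the authors' implementation: the function name, the majority-vote threshold over k neighbours, and the Gaussian-jitter generation scheme are assumptions, since the abstract does not specify the exact generation method.

```python
import numpy as np
from sklearn.neighbors import NearestNeighbors

def oversample_hard_instances(X, y, minority_label=1, k=5,
                              n_copies=3, noise=0.05, seed=0):
    """Phase 1: flag minority instances whose k nearest neighbours mostly
    belong to the opposite class ('noisy' in the paper's terminology).
    Phase 2: add jittered synthetic copies of each flagged instance."""
    rng = np.random.default_rng(seed)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X)
    _, idx = nn.kneighbors(X)                 # idx[:, 0] is the point itself
    X_parts, y_parts = [X], [y]
    for i in np.flatnonzero(y == minority_label):
        n_opposite = np.sum(y[idx[i, 1:]] != y[i])
        if n_opposite >= k / 2:               # 'noisy': hard to classify
            jitter = rng.normal(0.0, noise, size=(n_copies, X.shape[1]))
            X_parts.append(X[i] + jitter)     # synthetic near-copies
            y_parts.append(np.full(n_copies, y[i]))
    return np.vstack(X_parts), np.concatenate(y_parts)
```

Secure instances (neighbours mostly of the same class) are left untouched in this sketch; only the hard ones receive synthetic copies, which is what distinguishes the approach from blanket oversamplers such as SMOTE.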
Multitask prediction of organ dysfunction in the intensive care unit using sequential subnetwork routing.
OBJECTIVE: Multitask learning (MTL) using electronic health records allows concurrent prediction of multiple endpoints. MTL has shown promise in improving model performance and training efficiency; however, it often suffers from negative transfer: impaired learning if tasks are not appropriately selected. We introduce a sequential subnetwork routing (SeqSNR) architecture that uses soft parameter sharing to find related tasks and encourage cross-learning between them. MATERIALS AND METHODS: Using the MIMIC-III (Medical Information Mart for Intensive Care-III) dataset, we train deep neural network models to predict the onset of 6 endpoints including specific organ dysfunctions and general clinical outcomes: acute kidney injury, continuous renal replacement therapy, mechanical ventilation, vasoactive medications, mortality, and length of stay. We compare single-task (ST) models with naive multitask and SeqSNR in terms of discriminative performance and label efficiency. RESULTS: SeqSNR showed a modest yet statistically significant performance boost across 4 of 6 tasks compared with ST and naive multitasking. When the size of the training dataset was reduced for a given task (label efficiency), SeqSNR outperformed ST for all cases, showing an average area under the precision-recall curve boost of 2.1%, 2.9%, and 2.1% for tasks using 1%, 5%, and 10% of labels, respectively. CONCLUSIONS: The SeqSNR architecture shows superior label efficiency compared with ST and naive multitasking, suggesting utility in scenarios in which endpoint labels are difficult to ascertain.
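The core idea of soft parameter sharing with learned routing can be illustrated with a toy forward pass. This is a feed-forward stand-in, not SeqSNR itself: the real architecture routes between recurrent subnetworks over clinical time series, and the module counts, layer sizes, and softmax routing below are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

def relu(x):
    return np.maximum(x, 0.0)

class RoutedLayer:
    """A layer of small subnetwork modules whose outputs are mixed per
    task by a softmax over learnable routing logits, so related tasks
    can share modules while unrelated tasks route around each other."""
    def __init__(self, d_in, d_out, n_modules, n_tasks):
        self.W = rng.normal(0, 0.1, (n_modules, d_in, d_out))
        self.logits = np.zeros((n_tasks, n_modules))   # learned routing

    def forward(self, x, task):
        w = np.exp(self.logits[task])
        w /= w.sum()                                   # softmax routing weights
        outs = np.stack([relu(x @ Wm) for Wm in self.W])
        return np.tensordot(w, outs, axes=1)           # task-weighted mixture

# Two stacked routed layers followed by a linear head per task
layer1 = RoutedLayer(d_in=10, d_out=8, n_modules=3, n_tasks=2)
layer2 = RoutedLayer(d_in=8, d_out=4, n_modules=3, n_tasks=2)
heads = rng.normal(0, 0.1, (2, 4))

x = rng.normal(size=(5, 10))                           # batch of 5 patients
preds = [layer2.forward(layer1.forward(x, t), t) @ heads[t] for t in (0, 1)]
```

Training the routing logits jointly with the module weights is what lets the network discover which tasks should share capacity, mitigating the negative transfer that hard (naive) sharing can cause.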
Applying machine learning to automated segmentation of head and neck tumour volumes and organs at risk on radiotherapy planning CT and MRI scans
Radiotherapy is one of the main ways head and neck cancers are treated;
radiation is used to kill cancerous cells and prevent their recurrence.
Complex treatment planning is required to ensure that enough radiation is given
to the tumour and as little as possible to other sensitive structures (known as
organs at risk), such as the eyes and nerves, which might otherwise be damaged. This is
especially difficult in the head and neck, where multiple at-risk structures often
lie in extremely close proximity to the tumour. It can take radiotherapy experts
four hours or more to pick out the important areas on planning scans (known as
segmentation).
This research will focus on applying machine learning algorithms to automatic
segmentation of head and neck planning computed tomography (CT) and
magnetic resonance imaging (MRI) scans of University College London
Hospitals NHS Foundation Trust patients. Through analysis of the images used
in radiotherapy, DeepMind Health will investigate improvements in the efficiency
of cancer treatment pathways.
Automated analysis of retinal imaging using machine learning techniques for computer vision
There are almost two million people in the United Kingdom living with sight loss, including around 360,000 people who are registered as blind or partially sighted. Sight-threatening diseases, such as diabetic retinopathy and age-related macular degeneration, have contributed to the 40% increase in outpatient attendances in the last decade but are amenable to early detection and monitoring. With early and appropriate intervention, blindness may be prevented in many cases.
Ophthalmic imaging provides a way to diagnose and objectively assess the progression of a number of pathologies including neovascular ("wet") age-related macular degeneration (wet AMD) and diabetic retinopathy. Two methods of imaging are commonly used: digital photographs of the fundus (the "back" of the eye) and Optical Coherence Tomography (OCT, a modality that uses light waves in a similar way to how ultrasound uses sound waves). Changes in population demographics and expectations and the changing pattern of chronic diseases create a rising demand for such imaging. Meanwhile, interrogation of such images is time consuming, costly, and prone to human error. The application of novel analysis methods may provide a solution to these challenges.
This research will focus on applying novel machine learning algorithms to automatic analysis of both digital fundus photographs and OCT in Moorfields Eye Hospital NHS Foundation Trust patients.
Through analysis of the images used in ophthalmology, along with relevant clinical and demographic information, Google DeepMind Health will investigate the feasibility of automated grading of digital fundus photographs and OCT and provide novel quantitative measures for specific disease features and for monitoring therapeutic success.
The signature and cusp geometry of hyperbolic knots
We introduce a new real-valued invariant called the natural slope
of a hyperbolic knot in the 3-sphere, which is defined in terms of its cusp
geometry. We show that twice the knot signature and the natural slope differ
by at most a constant times the hyperbolic volume divided by the cube of
the injectivity radius. This inequality was discovered using machine learning
to detect relationships between various knot invariants. It has applications
to Dehn surgery and to 4-ball genus. We also show a refined version of the
inequality where the upper bound is a linear function of the volume, and the
slope is corrected by terms corresponding to short geodesics that link the knot
an odd number of times.
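In symbols, the main inequality reads roughly as follows. This is a reconstruction from the abstract's prose; the value of the constant and the exact normalisation of the slope are the paper's and are not stated here.

```latex
\left| 2\,\sigma(K) - \mathrm{slope}(K) \right|
  \;\le\; c \,\frac{\operatorname{vol}(K)}{\operatorname{inj}(K)^{3}}
```

where $\sigma(K)$ is the knot signature, $\mathrm{slope}(K)$ the natural slope, $\operatorname{vol}(K)$ the hyperbolic volume, $\operatorname{inj}(K)$ the injectivity radius, and $c$ a universal constant.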
Advancing mathematics by guiding human intuition with AI
The practice of mathematics involves discovering patterns and using these to formulate and prove conjectures, resulting in theorems. Since the 1960s, mathematicians have used computers to assist in the discovery of patterns and formulation of conjectures [1], most famously in the Birch and Swinnerton-Dyer conjecture [2], a Millennium Prize Problem [3]. Here we provide examples of new fundamental results in pure mathematics that have been discovered with the assistance of machine learning, demonstrating a method by which machine learning can aid mathematicians in discovering new conjectures and theorems. We propose a process of using machine learning to discover potential patterns and relations between mathematical objects, understanding them with attribution techniques and using these observations to guide intuition and propose conjectures. We outline this machine-learning-guided framework and demonstrate its successful application to current research questions in distinct areas of pure mathematics, in each case showing how it led to meaningful mathematical contributions on important open problems: a new connection between the algebraic and geometric structure of knots, and a candidate algorithm predicted by the combinatorial invariance conjecture for symmetric groups [4]. Our work may serve as a model for collaboration between the fields of mathematics and artificial intelligence (AI) that can achieve surprising results by leveraging the respective strengths of mathematicians and machine learning.
Developing a reporting guideline for artificial intelligence-centred diagnostic test accuracy studies: the STARD-AI protocol
Introduction Standards for Reporting of Diagnostic Accuracy Study (STARD) was developed to improve the completeness and transparency of reporting in studies investigating diagnostic test accuracy. However, its current form, STARD 2015, does not address the issues and challenges raised by artificial intelligence (AI)-centred interventions. As such, we propose an AI-specific version of the STARD checklist (STARD-AI), which focuses on the reporting of AI diagnostic test accuracy studies. This paper describes the methods that will be used to develop STARD-AI. Methods and analysis The development of the STARD-AI checklist can be distilled into six stages. (1) A project organisation phase has been undertaken, during which a Project Team and a Steering Committee were established; (2) An item generation process has been completed following a literature review, a patient and public involvement and engagement exercise and an online scoping survey of international experts; (3) A three-round modified Delphi consensus methodology is underway, which will culminate in a teleconference consensus meeting of experts; (4) Thereafter, the Project Team will draft the initial STARD-AI checklist and the accompanying documents; (5) A piloting phase among expert users will be undertaken to identify items which are either unclear or missing. This process, consisting of surveys and semistructured interviews, will contribute towards the explanation and elaboration document; and (6) On finalisation of the manuscripts, the group's efforts turn towards an organised dissemination and implementation strategy to maximise end-user adoption. Ethics and dissemination Ethical approval has been granted by the Joint Research Compliance Office at Imperial College London (reference number: 19IC5679). A dissemination strategy will be aimed towards five groups of stakeholders: (1) academia, (2) policy, (3) guidelines and regulation, (4) industry and (5) public and non-specific stakeholders.
We anticipate that dissemination will take place in Q3 of 2021.