12 research outputs found

    Object matching using boundary descriptors

    The problem of object recognition is of immense practical importance and potential, and the last decade has witnessed a number of breakthroughs in the state of the art. Most of the past object recognition work focuses on textured objects and local appearance descriptors extracted around salient points in an image. These methods fail in the matching of smooth, untextured objects for which salient point detection does not produce robust results. The recently proposed bag of boundaries (BoB) method is the first to directly address this problem. Since the texture of smooth objects is largely uninformative, BoB focuses on describing and matching objects based on their post-segmentation boundaries. Herein we address three major weaknesses of this work. The first of these is the uniform treatment of all boundary segments. Instead, we describe a method for detecting the locations and scales of salient boundary segments. Secondly, while the BoB method uses an image based elementary descriptor (HoGs + occupancy matrix), we propose a more compact descriptor based on the local profile of boundary normals’ directions. Lastly, we conduct a far more systematic evaluation, both of the bag of boundaries method and the method proposed here. Using a large public database, we demonstrate that our method exhibits greater robustness while at the same time achieving a major computational saving – object representation is extracted from an image in only 6% of the time needed to extract a bag of boundaries, and the storage requirement is similarly reduced to less than 8%.
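    As a rough illustration of the kind of descriptor the abstract sketches, the snippet below histograms the directions of boundary normals within a local window around a boundary point. The window size, bin count, and function name are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def boundary_normal_profile(points, centre_idx, window=15, n_bins=16):
    """Illustrative descriptor (assumed parameters): histogram of boundary
    normal directions in a local window around one boundary point."""
    n = len(points)
    idx = [(centre_idx + k) % n for k in range(-window, window + 1)]
    segment = np.asarray([points[i] for i in idx], dtype=float)

    # Tangents via finite differences along the (closed) boundary.
    tangents = np.gradient(segment, axis=0)
    # Normals are tangents rotated by 90 degrees.
    normals = np.stack([-tangents[:, 1], tangents[:, 0]], axis=1)
    angles = np.arctan2(normals[:, 1], normals[:, 0])

    # Express directions relative to the central normal so the profile is
    # invariant to in-plane rotation of the object.
    angles = (angles - angles[window] + np.pi) % (2 * np.pi) - np.pi

    hist, _ = np.histogram(angles, bins=n_bins, range=(-np.pi, np.pi))
    return hist / max(hist.sum(), 1)  # L1-normalised local profile
```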

    Prediction of future hospital admissions - what is the tradeoff between specificity and accuracy?

    Large amounts of electronic medical records collected by hospitals across the developed world offer unprecedented possibilities for knowledge discovery using computer based data mining and machine learning. Notwithstanding significant research efforts, the use of this data in the prediction of disease development has largely been disappointing. In this paper we examine in detail a recently proposed method which has in preliminary experiments demonstrated highly promising results on real-world data. We scrutinize the authors' claims that the proposed model is scalable and investigate whether the tradeoff between prediction specificity (i.e. the ability of the model to predict a wide range of different ailments) and accuracy (i.e. the ability of the model to make the correct prediction) is practically viable. Our experiments, conducted on a data corpus of nearly 3,000,000 admissions, support the authors' expectations and demonstrate that high prediction accuracy is maintained well even when the number of admission types explicitly included in the model is increased to account for 98% of all admissions in the corpus. Thus several promising directions for future work are highlighted. Comment: In Proc. International Conference on Bioinformatics and Computational Biology, April 201
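    The specificity-accuracy tradeoff investigated here can be probed with a simple sweep: keep only the k most frequent admission types as explicit classes, collapse the remainder into an 'other' class, and track test accuracy as k grows. The sketch below is a hedged illustration using scikit-learn on a hypothetical feature matrix X and list of admission codes; it is not the authors' model.

```python
import numpy as np
from collections import Counter
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

def accuracy_vs_specificity(X, admission_codes, ks=(10, 50, 100, 250)):
    """Hypothetical sweep: keep the k most frequent admission types as
    explicit classes, collapse the rest into 'other', and record test
    accuracy for each k (X is an assumed per-admission feature matrix)."""
    results = {}
    for k in ks:
        top = {code for code, _ in Counter(admission_codes).most_common(k)}
        y = np.array([c if c in top else "other" for c in admission_codes])
        X_tr, X_te, y_tr, y_te = train_test_split(
            X, y, test_size=0.2, random_state=0)
        clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
        results[k] = clf.score(X_te, y_te)
    return results
```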

    Towards objective and reproducible study of patient-doctor interaction: automatic text analysis based VR-CoDES annotation of consultation transcripts

    While increasingly appreciated for its importance, the interaction between health care professionals (HCP) and patients is notoriously difficult to study, with both methodological and practical challenges. The former has been addressed by the so-called Verona coding definitions of emotional sequences (VR-CoDES) – a system for identifying and coding patient emotions and the corresponding HCP responses – shown to be reliable and informative in a number of independent studies in different health care delivery contexts. In the present work we focus on the practical challenge of the scalability of this coding system, namely on making it easily usable more widely and on applying it to larger patient cohorts. In particular, VR-CoDES is inherently complex and training is required to ensure consistent annotation of audio recordings or textual transcripts of consultations. Following up on our previous pilot investigation, in the present paper we describe the first automatic, computer based algorithm capable of providing coarse level coding of textual transcripts. We investigate different representations of patient utterances and classification methodologies, and label each utterance as either containing an explicit expression of emotional distress (a ‘concern’), an implicit one (a ‘cue’), or neither. Using a data corpus comprising 200 consultations between radiotherapists and adult female breast cancer patients we demonstrate excellent labelling performance.
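    As a sketch of how coarse utterance-level coding might be automated, the toy example below trains a TF-IDF bag-of-words classifier to label utterances as 'concern', 'cue', or 'neither'. The utterances, labels, and model choice are illustrative assumptions; the paper's actual representations and classifiers may differ.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

# Toy utterances with VR-CoDES-style coarse labels; real transcripts and
# annotations are assumed to be available separately.
utterances = [
    "I am really frightened about the treatment",  # explicit distress
    "I have not been sleeping well lately",         # hint of distress
    "What time is my next appointment?",             # neutral
]
labels = ["concern", "cue", "neither"]

# A word n-gram TF-IDF representation with a linear classifier is one
# plausible baseline for coarse labelling of utterances.
model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),
    LogisticRegression(max_iter=1000),
)
model.fit(utterances, labels)
print(model.predict(["I keep worrying about the results"]))
```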

    Towards sophisticated learning from EHRs: increasing prediction specificity and accuracy using clinically meaningful risk criteria

    Computer based analysis of Electronic Health Records (EHRs) has the potential to provide major novel insights of benefit both to specific individuals, in the context of personalized medicine, and at the level of population-wide health care and policy. The present paper introduces a novel algorithm that uses machine learning for the discovery of longitudinal patterns in the diagnoses of diseases. Two key technical novelties are introduced: one in the form of a novel learning paradigm which enables greater learning specificity, and another in the form of a risk driven identification of confounding diagnoses. We present a series of experiments which demonstrate the effectiveness of the proposed techniques, and which reveal novel insights regarding the most promising future research directions.

    Using Twitter to learn about the autism community

    Considering the rising socio-economic burden of autism spectrum disorder (ASD), timely and evidence-driven public policy decision making and communication of the latest guidelines pertaining to the treatment and management of the disorder is crucial. Yet evidence suggests that policy makers and medical practitioners do not always have a good understanding of the practices and relevant beliefs of the carers of ASD-afflicted individuals, who often follow questionable recommendations and adopt advice poorly supported by scientific data. The key goal of the present work is to explore the idea that Twitter, as a highly popular platform for information exchange, could be used as a data-mining source to learn about the population affected by ASD – their behaviour, concerns, needs, etc. To this end, using a large data set of over 11 million harvested tweets as the basis for our investigation, we describe a series of experiments which examine a range of linguistic and semantic aspects of messages posted by individuals interested in ASD. Our findings, the first of their nature in the published scientific literature, strongly motivate additional research on this topic and present a methodological basis for further work. Comment: Social Network Analysis and Mining, 201
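    A minimal starting point for this kind of linguistic analysis is to surface the most frequent content words in the harvested tweets, which can then be compared against a background corpus. The helper below is only a hypothetical first step (tokeniser, stop-word list, and thresholds are all assumptions), not the paper's methodology.

```python
import re
from collections import Counter

STOPWORDS = frozenset({"the", "a", "an", "and", "to", "of", "is", "in", "for", "with"})

def top_terms(tweets, k=20):
    """Illustrative first pass: count the most frequent content words
    across a collection of harvested tweet texts."""
    counts = Counter()
    for text in tweets:
        for token in re.findall(r"[a-z']+", text.lower()):
            if token not in STOPWORDS and len(token) > 2:
                counts[token] += 1
    return counts.most_common(k)

# Hypothetical usage on a list of tweet texts mentioning ASD:
# print(top_terms(["My son was just diagnosed with autism ..."]))
```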

    Automated Identification of Individual Great White Sharks from Unrestricted Fin Imagery


    Multi-scale Regions from Edge Fragments: A Graph Theory Approach


    Ancient Roman coin retrieval: a systematic examination of the effects of coin grade

    Ancient coins are historical artefacts of great significance which attract the interest of scholars and of a large and growing number of amateur collectors. Computer vision based analysis and retrieval of ancient coins holds much promise in this realm, and has been the subject of an increasing amount of research. The present work is in great part motivated by the lack of systematic evaluation of the existing methods in the context of coin grade, which is one of the key challenges both to humans and to automatic methods. We describe a series of methods – some adopted from previous work and others extensions thereof – and perform the first thorough analysis to date.

    Towards computer vision based ancient coin recognition in the wild — automatic reliable image preprocessing and normalization

    As an attractive area of application in the sphere of cultural heritage, automatic analysis of ancient coins has in recent years been attracting an increasing amount of research attention from the computer vision community. Recent work has demonstrated that the existing state of the art performs extremely poorly when applied to images acquired in realistic conditions. One of the reasons behind this lies in the (often implicit) assumptions made by many of the proposed algorithms — a lack of background clutter, and a uniform scale, orientation, and translation of coins across different images. These assumptions are not satisfied by default and, before any further progress in the realm of more complex analysis is made, a robust method capable of preprocessing and normalizing images of coins acquired ‘in the wild’ is needed. In this paper we introduce an algorithm capable of localizing and accurately segmenting out a coin from a cluttered image acquired by an amateur collector. Specifically, we propose a two-stage approach which first uses a simple shape hypothesis to localize the coin roughly and then arrives at the final, accurate result by refining this initial estimate using a statistical model learnt from large amounts of data. Our results on data collected ‘in the wild’ demonstrate excellent accuracy even when the proposed algorithm is applied to highly challenging images.
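    The first, rough localization stage lends itself to a compact sketch: hypothesize the coin as the dominant circular shape in the image. The snippet below uses OpenCV's Hough circle transform with guessed parameters; it stands in for the simple shape hypothesis only, and the statistically learnt refinement stage is not reproduced here.

```python
import cv2
import numpy as np

def rough_coin_localization(image_bgr):
    """Stage-one sketch (assumed parameters): hypothesise the coin as the
    strongest circular shape in a cluttered photograph."""
    grey = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    grey = cv2.medianBlur(grey, 5)
    circles = cv2.HoughCircles(
        grey, cv2.HOUGH_GRADIENT, dp=1.5,
        minDist=grey.shape[0] // 2,
        param1=120, param2=60,
        minRadius=grey.shape[0] // 8,
        maxRadius=grey.shape[0] // 2,
    )
    if circles is None:
        return None
    x, y, r = np.round(circles[0, 0]).astype(int)
    return (x, y), r  # centre and radius of the hypothesised coin
```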

    Descriptor transition tables for object retrieval using unconstrained cluttered video acquired using a consumer level handheld mobile device

    Visual recognition and vision based retrieval of objects from large databases are tasks with a wide spectrum of potential applications. In this paper we propose a novel method for recognition from video sequences, suitable for retrieval from databases acquired in highly unconstrained conditions, e.g. using a consumer-level mobile device such as a phone. On the lowest level, we represent each sequence as a 3D mesh of densely packed local appearance descriptors. While image plane geometry is captured implicitly by a large overlap of neighbouring regions from which the descriptors are extracted, 3D information is extracted by means of a descriptor transition table, learnt from a single sequence for each known gallery object. These tables allow us to connect local descriptors along the third dimension (which corresponds to viewpoint changes), resulting in a set of variable length Markov chains for each video. The matching of two sets of such chains is formulated as a statistical hypothesis test, whereby a subset of each is chosen to maximize the likelihood that the corresponding video sequences show the same object. The effectiveness of the proposed algorithm is empirically evaluated on the Amsterdam Library of Object Images and a new, highly challenging video data set acquired using a mobile phone. On both data sets our method is shown to be successful in recognition in the presence of background clutter and large viewpoint changes.
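    A much-simplified reading of the transition-table idea: if each frame's local descriptors are quantised into visual words, one can count how often word i in one frame is followed by word j in the next, giving a table of transition probabilities across viewpoint change. The sketch below illustrates only this counting step under assumed inputs; the variable length Markov chains and the hypothesis-test matching are not reproduced.

```python
import numpy as np

def descriptor_transition_table(frame_labels, n_words):
    """Illustrative transition table: frame_labels[t] holds the quantised
    descriptor labels (visual words) observed in frame t; the table counts
    how often word i in one frame co-occurs with word j in the next frame."""
    table = np.zeros((n_words, n_words), dtype=float)
    for current, following in zip(frame_labels, frame_labels[1:]):
        for i in current:
            for j in following:
                table[i, j] += 1
    # Row-normalise to obtain transition probabilities across viewpoint change.
    row_sums = table.sum(axis=1, keepdims=True)
    return np.divide(table, row_sums, out=np.zeros_like(table),
                     where=row_sums > 0)
```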