Comparison of view-based and reconstruction-based models of human navigational strategy
There is good evidence that simple animals such as bees use view-based strategies to return to a familiar location, but humans could use a 3D reconstruction to achieve the same goal. Assuming some noise in the storage and retrieval process, these two types of strategy give rise to different patterns of predicted errors in homing. We describe an experiment that can help distinguish between these models. Participants wore a head-mounted display to carry out a homing task in immersive virtual reality. They viewed three long thin vertical poles and had to remember where they were in relation to the poles before being transported (virtually) to a new location in the scene, from where they had to walk back to the original location. The experiment was conducted in both a rich-cue scene (a furnished room) and a sparse scene (no background and no floor or ceiling). As one would expect, in the rich-cue environment the overall error was smaller, and in this case the ability to separate the models was reduced. However, for the sparse-cue environment the view-based model outperforms the reconstruction-based model. Specifically, the likelihood of the experimental data is similar to the likelihood of samples drawn from the view-based model (but assessed under both models), while this is not true for samples drawn from the reconstruction-based model.
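The model-comparison logic described here can be sketched in a few lines. The sketch below is a hypothetical simplification (the function names, bandwidth, and noise levels are our own, not the paper's): samples drawn from each candidate model are turned into a kernel-density estimate, and the observed homing endpoints are scored under both.

```python
import numpy as np

def log_likelihood(endpoints, model_samples, bandwidth=0.2):
    """Kernel-density estimate of the log-likelihood of observed 2-D homing
    endpoints under samples drawn from a candidate model of homing error."""
    diffs = endpoints[:, None, :] - model_samples[None, :, :]   # (N, M, 2)
    sq = (diffs ** 2).sum(-1)
    dens = np.exp(-sq / (2 * bandwidth ** 2)).mean(1) / (2 * np.pi * bandwidth ** 2)
    return np.log(dens + 1e-12).sum()

rng = np.random.default_rng(0)
view_samples = rng.normal(0.0, 0.3, size=(500, 2))    # hypothetical view-based predictions
recon_samples = rng.normal(0.0, 0.6, size=(500, 2))   # hypothetical reconstruction predictions
data = rng.normal(0.0, 0.3, size=(40, 2))             # simulated observed endpoints

# Endpoints generated like the view-based model should score higher under it
better = log_likelihood(data, view_samples) > log_likelihood(data, recon_samples)
```

Because the simulated endpoints share the view-based model's error distribution, the view-based likelihood wins; on real data, whichever model's samples better match the empirical error pattern scores higher.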
Seeing the arrow of time
We explore whether we can observe Time's Arrow in a temporal sequence: is it possible to tell whether a video is running forwards or backwards? We investigate this somewhat philosophical question using computer vision and machine learning techniques. We explore three methods by which we might detect Time's Arrow in video sequences, based on distinct ways in which motion in video sequences might be asymmetric in time. We demonstrate good video forwards/backwards classification results on a selection of YouTube video clips, and on natively captured sequences (with no temporally dependent video compression), and examine what motions the models have learned that help discriminate forwards from backwards time.
Funding: European Research Council (ERC grant VisRec no. 228180); National Basic Research Program of China (973 Program) (2013CB329503); National Natural Science Foundation of China (NSFC grant no. 91120301); United States Office of Naval Research (ONR MURI grant N00014-09-1-1051); National Science Foundation (U.S.) (NSF CGV-1111415).
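One way motion statistics can be asymmetric in time is illustrated by this toy sketch (our own caricature, not the paper's actual features): statistics of frame-to-frame differences that flip sign when a clip is reversed could feed a forwards/backwards classifier.

```python
import numpy as np

def temporal_asymmetry_features(frames):
    """Toy features from frame differences of a (T, H, W) clip. The mean and
    third moment of temporal differences flip sign under time reversal,
    while the mean absolute difference is invariant to it."""
    d = np.diff(frames.astype(float), axis=0)
    return np.array([d.mean(), (d ** 3).mean(), np.abs(d).mean()])

# A clip whose brightness decays over time, e.g. a light fading out
t = np.arange(20)[:, None, None]
clip = np.exp(-0.2 * t) * np.ones((20, 8, 8))

fwd = temporal_asymmetry_features(clip)        # negative mean difference
bwd = temporal_asymmetry_features(clip[::-1])  # sign flips when reversed
```

A classifier trained on such features would learn which sign patterns correspond to forward-running time in natural footage.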
Evaluation of a novel deep learning-based classifier for perifissural nodules
OBJECTIVES: To evaluate the performance of a novel convolutional neural network (CNN) for the classification of typical perifissural nodules (PFN). METHODS: Chest CT data from two centers in the UK and The Netherlands (1668 unique nodules, 1260 individuals) were collected. Pulmonary nodules were classified into subtypes, including "typical PFNs", on-site, and were reviewed by a central clinician. The dataset was divided into a training/cross-validation set of 1557 nodules (1103 individuals) and a test set of 196 nodules (158 individuals). For the test set, three radiologically trained readers classified the nodules into three nodule categories: typical PFN, atypical PFN, and non-PFN. The consensus of the three readers was used as the reference to evaluate the performance of the PFN-CNN. Typical PFNs were considered as positive results, and atypical PFNs and non-PFNs were grouped as negative results. PFN-CNN performance was evaluated using the ROC curve, confusion matrix, and Cohen's kappa. RESULTS: Internal validation yielded a mean AUC of 91.9% (95% CI 90.6-92.9) with 78.7% sensitivity and 90.4% specificity. For the test set, the reader consensus rated 45/196 (23%) of nodules as typical PFN. The classifier-reader agreement (k = 0.62-0.75) was similar to the inter-reader agreement (k = 0.64-0.79). Area under the ROC curve was 95.8% (95% CI 93.3-98.4), with a sensitivity of 95.6% (95% CI 84.9-99.5), and specificity of 88.1% (95% CI 81.8-92.8). CONCLUSION: The PFN-CNN showed excellent performance in classifying typical PFNs. Its agreement with radiologically trained readers is within the range of inter-reader agreement. Thus, the CNN-based system has potential in clinical and screening settings to rule out perifissural nodules and increase reader efficiency. KEY POINTS: • Agreement between the PFN-CNN and radiologically trained readers is within the range of inter-reader agreement.
• The CNN model for the classification of typical PFNs achieved an AUC of 95.8% (95% CI 93.3-98.4) with 95.6% (95% CI 84.9-99.5) sensitivity and 88.1% (95% CI 81.8-92.8) specificity compared to the consensus of three readers.
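The evaluation metrics quoted above — sensitivity, specificity, and Cohen's kappa — all follow from a binary confusion matrix. A self-contained sketch (the toy labels are invented, not the study data):

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Sensitivity, specificity and Cohen's kappa for binary labels
    (1 = typical PFN, 0 = atypical/non-PFN grouped together)."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    n = tp + tn + fp + fn
    p_obs = (tp + tn) / n                                            # observed agreement
    p_exp = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2   # chance agreement
    kappa = (p_obs - p_exp) / (1 - p_exp)
    return sensitivity, specificity, kappa

# Toy example: 4 positives, 6 negatives, one miss and one false alarm
sens, spec, kappa = binary_metrics([1, 1, 1, 1, 0, 0, 0, 0, 0, 0],
                                   [1, 1, 1, 0, 0, 0, 0, 0, 0, 1])
```

Kappa corrects the raw agreement for what two raters would agree on by chance alone, which is why it is the preferred statistic when class prevalence is skewed, as with the 23% typical-PFN rate above.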
Artificial Intelligence Tool for Assessment of Indeterminate Pulmonary Nodules Detected with CT
Background: Limited data are available regarding whether computer-aided diagnosis (CAD) improves assessment of malignancy risk in indeterminate pulmonary nodules (IPNs).
Purpose: To evaluate the effect of an artificial intelligence-based CAD tool on clinician IPN diagnostic performance and agreement for both malignancy risk categories and management recommendations.
Materials and Methods: This was a retrospective multireader multicase study performed in June and July 2020 on chest CT studies of IPNs. Readers used only CT imaging data and provided an estimate of malignancy risk and a management recommendation for each case without and with CAD. The effect of CAD on average reader diagnostic performance was assessed using the Obuchowski-Rockette and Dorfman-Berbaum-Metz method to calculate estimates of area under the receiver operating characteristic curve (AUC), sensitivity, and specificity. Multirater Fleiss κ statistics were used to measure interobserver agreement for malignancy risk and management recommendations.
Results: A total of 300 chest CT scans of IPNs with maximal diameters of 5-30 mm (50.0% malignant) were reviewed by 12 readers (six radiologists, six pulmonologists) (patient median age, 65 years; IQR, 59-71 years; 164 [55%] men). Readers' average AUC improved from 0.82 to 0.89 with CAD (P < .001). At malignancy risk thresholds of 5% and 65%, use of CAD improved average sensitivity from 94.1% to 97.9% (P = .01) and from 52.6% to 63.1% (P < .001), respectively. Average reader specificity improved from 37.4% to 42.3% (P = .03) and from 87.3% to 89.9% (P = .05), respectively. Reader interobserver agreement improved with CAD for both the less than 5% (Fleiss κ, 0.50 vs 0.71; P < .001) and more than 65% (Fleiss κ, 0.54 vs 0.71; P < .001) malignancy risk categories. Overall reader interobserver agreement for management recommendation categories (no action, CT surveillance, diagnostic procedure) also improved with CAD (Fleiss κ, 0.44 vs 0.52; P = .001).
Conclusion: Use of computer-aided diagnosis improved estimation of indeterminate pulmonary nodule malignancy risk on chest CT scans and improved interobserver agreement for both risk stratification and management recommendations.
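Fleiss' κ, used above to measure agreement among the 12 readers, generalises Cohen's κ to more than two raters. A minimal sketch (the 12-rater setup mirrors the study, but the counts are invented):

```python
import numpy as np

def fleiss_kappa(counts):
    """Fleiss' kappa from an (n_items, n_categories) matrix, where each row
    records how many raters assigned that item to each category (row sums
    must all equal the number of raters)."""
    counts = np.asarray(counts, dtype=float)
    n_raters = counts.sum(axis=1)[0]
    p_cat = counts.sum(axis=0) / counts.sum()              # category prevalence
    agree = (np.square(counts).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_bar, p_exp = agree.mean(), np.square(p_cat).sum()    # observed vs chance
    return (p_bar - p_exp) / (1 - p_exp)

# 12 hypothetical readers sorting three nodules into three categories
perfect = fleiss_kappa([[12, 0, 0], [0, 12, 0], [0, 0, 12]])   # full agreement
mixed = fleiss_kappa([[8, 4, 0], [2, 9, 1], [0, 3, 9]])        # partial agreement
```

A value of 1 indicates perfect agreement and 0 indicates agreement no better than chance, so the study's improvement from κ ≈ 0.5 to κ ≈ 0.7 with CAD reflects substantially more consistent risk categorisation.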
Modelling human visual navigation using multi-view scene reconstruction
It is often assumed that humans generate a 3D reconstruction of the environment, either in egocentric or world-based coordinates, but the steps involved are unknown. Here, we propose two reconstruction-based models, evaluated using data from two tasks in immersive virtual reality. We model the observer's prediction of landmark location based on standard photogrammetric methods and then combine location predictions to compute likelihood maps of navigation behaviour. In one model, each scene point is treated independently in the reconstruction; in the other, the pertinent variable is the spatial relationship between pairs of points. Participants viewed a simple environment from one location, were transported (virtually) to another part of the scene and were asked to navigate back. Error distributions varied substantially with changes in scene layout; we compared these directly with the likelihood maps to quantify the success of the models. We also measured error distributions when participants manipulated the location of a landmark to match the preceding interval, providing a direct test of the landmark-location stage of the navigation models. Models such as this, which start with scenes and end with a probabilistic prediction of behaviour, are likely to be increasingly useful for understanding 3D vision.
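One ingredient of such models — combining per-landmark location predictions into a likelihood map over candidate stopping positions — can be sketched as follows. This is a hypothetical simplification with independent isotropic Gaussians, not the paper's full photogrammetric pipeline:

```python
import numpy as np

def likelihood_map(pred_goals, sigma, xs, ys):
    """Combine one Gaussian goal-location prediction per landmark cue into
    a normalised likelihood map over a grid of candidate positions."""
    X, Y = np.meshgrid(xs, ys)
    log_l = np.zeros_like(X)
    for gx, gy in pred_goals:   # sum log-densities = product of Gaussians
        log_l += -((X - gx) ** 2 + (Y - gy) ** 2) / (2 * sigma ** 2)
    lik = np.exp(log_l - log_l.max())   # subtract max for numerical stability
    return lik / lik.sum()

xs = ys = np.linspace(-1.0, 1.0, 21)
# Three landmark cues all predicting a goal near the origin
lmap = likelihood_map([(0.0, 0.0), (0.05, 0.0), (0.0, -0.05)], 0.3, xs, ys)
```

The resulting map can be compared directly against an empirical distribution of walking endpoints, which is the kind of model-versus-data comparison the abstract describes.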
Machine learning in multi-frame image super-resolution
Multi-frame image super-resolution is a procedure which takes several noisy low-resolution images of the same scene, acquired under different conditions, and processes them together to synthesize one or more high-quality super-resolution images, with higher spatial frequency, and less noise and image blur than any of the original images. The inputs can take the form of medical images, surveillance footage, digital video, satellite terrain imagery, or images from many other sources. This thesis focuses on Bayesian methods for multi-frame super-resolution, which use a prior distribution over the super-resolution image. The goal is to produce outputs which are as accurate as possible, and this is achieved through three novel super-resolution schemes presented in this thesis. Previous approaches obtained the super-resolution estimate by first computing and fixing the imaging parameters (such as image registration), and then computing the super-resolution image with this registration. In the first of the approaches taken here, superior results are obtained by optimizing over both the registrations and image pixels, creating a complete simultaneous algorithm. Additionally, parameters for the prior distribution are learnt automatically from data, rather than being set by trial and error. In the second approach, uncertainty in the values of the imaging parameters is dealt with by marginalization. In a previous Bayesian image super-resolution approach, the marginalization was over the super-resolution image, necessitating the use of an unfavorable image prior. By integrating over the imaging parameters rather than the image, the novel method presented here allows for more realistic prior distributions, and also reduces the dimension of the integral considerably, removing the main computational bottleneck of the other algorithm. Finally, a domain-specific image prior, based upon patches sampled from other images, is presented. 
For certain types of super-resolution problems where it is applicable, this sample-based prior gives a significant improvement in the super-resolution image quality.
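The joint MAP formulation described in the thesis can be illustrated with a toy 1-D analogue. All modelling choices below are ours for illustration (known integer shifts, box blur with 2× downsampling, a quadratic smoothness prior), not the thesis's actual imaging model; the estimate is found by gradient descent on the MAP cost.

```python
import numpy as np

def observe(x_hr, shift, factor=2):
    """Toy forward model: circular integer shift, then box blur + downsample."""
    return np.roll(x_hr, shift).reshape(-1, factor).mean(axis=1)

def map_estimate(lows, shifts, factor=2, lam=0.1, iters=500, lr=0.5):
    """MAP estimate of the high-res signal from several shifted low-res
    observations, with a quadratic smoothness prior of weight `lam`."""
    x = np.zeros(lows[0].size * factor)
    for _ in range(iters):
        grad = np.zeros_like(x)
        for y, s in zip(lows, shifts):
            r = observe(x, s, factor) - y        # data residual
            up = np.repeat(r, factor) / factor   # adjoint of blur + downsample
            grad += np.roll(up, -s)              # adjoint of the shift
        grad += lam * np.convolve(x, [-1.0, 2.0, -1.0], mode="same")  # prior
        x -= lr * grad
    return x

x_true = np.sin(np.linspace(0, 2 * np.pi, 16, endpoint=False))
lows = [observe(x_true, s) for s in (0, 1)]      # two shifted observations
x_hat = map_estimate(lows, [0, 1])
```

In the thesis's first contribution the registrations (`shifts` here) would be optimised jointly with `x` rather than held fixed, and the prior weight would be learnt from data rather than hand-set as above.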
A. Zisserman. Optimizing and learning for super-resolution
In multiple-image super-resolution, a high resolution image is estimated from a number of lower-resolution images. This involves computing the parameters of a generative imaging model (such as geometric and photometric registration, and blur) and obtaining a MAP estimate by minimizing a cost function including an appropriate prior. We consider the quite general geometric registration situation modelled by a plane projective transformation, and make two novel contributions: (i) in previous approaches the MAP estimate has been obtained by first computing and fixing the registration, and then computing the super-resolution image with this registration. We demonstrate that superior estimates are obtained by optimizing over both the registration and image; (ii) the parameters of the edge preserving prior are learnt automatically from the data, rather than being set by trial and error. We show examples on a number of real sequences including multiple stills, digital video, and DVDs of movies.
A sampled texture prior for image super-resolution
Super-resolution aims to produce a high-resolution image from a set of one or more low-resolution images by recovering or inventing plausible high-frequency image content. Typical approaches try to reconstruct a high-resolution image using the sub-pixel displacements of several low-resolution images, usually regularized by a generic smoothness prior over the high-resolution image space. Other methods use training data to learn low-to-high-resolution matches, and have been highly successful even in the single-input-image case. Here we present a domain-specific image prior in the form of a p.d.f. based upon sampled images, and show that for certain types of super-resolution problems, this sample-based prior gives a significant improvement over other common multiple-image super-resolution techniques.
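The idea of a sampled image prior can be caricatured in a few lines: score an image by how far each of its patches lies from the nearest patch in a library sampled from example images. This nearest-neighbour energy is a toy stand-in for the paper's sampled p.d.f., not its actual formulation.

```python
import numpy as np

def extract_patches(img, psize=3):
    """All overlapping psize x psize patches of a 2-D image, flattened."""
    h, w = img.shape
    return np.array([img[i:i + psize, j:j + psize].ravel()
                     for i in range(h - psize + 1)
                     for j in range(w - psize + 1)])

def sample_prior_energy(img, library, psize=3):
    """Negative log of a toy sample-based prior: sum over the image's patches
    of the squared distance to the nearest library patch (lower = more
    plausible under the example texture)."""
    return sum(((library - p) ** 2).sum(axis=1).min()
               for p in extract_patches(img, psize))

texture = np.indices((6, 6)).sum(axis=0) % 2.0    # checkerboard example image
library = extract_patches(texture)

consistent = sample_prior_energy(texture, library)               # zero energy
perturbed = sample_prior_energy(texture + 0.5 * np.eye(6), library)
```

In a super-resolution setting this energy would be added to the data-fidelity term, steering the reconstruction toward high-resolution content that resembles the sampled domain.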