3,602 research outputs found

    Guided Proofreading of Automatic Segmentations for Connectomics

    Full text link
    Automatic cell image segmentation methods in connectomics produce merge and split errors, which require correction through proofreading. Previous research has identified the visual search for these errors as the bottleneck in interactive proofreading. To aid error correction, we develop two classifiers that automatically recommend candidate merges and splits to the user. These classifiers use a convolutional neural network (CNN) that has been trained with errors in automatic segmentations against expert-labeled ground truth. Our classifiers detect potentially-erroneous regions by considering a large context region around a segmentation boundary. Corrections can then be performed by a user with yes/no decisions, which reduces variation of information 7.5x faster than previous proofreading methods. We also present a fully-automatic mode that uses a probability threshold to make merge/split decisions. Extensive experiments using the automatic approach and comparing performance of novice and expert users demonstrate that our method performs favorably against state-of-the-art proofreading methods on different connectomics datasets.Comment: Supplemental material available at http://rhoana.org/guidedproofreading/supplemental.pd

    Semi-Automation in Video Editing

    Get PDF
    Semi-automasjon i video redigering Hvordan kan vi bruke kunstig intelligens (KI) og maskin læring til å gjøre videoredigering like enkelt som å redigere tekst? I denne avhandlingen vil jeg adressere problemet med å bruke KI i videoredigering fra et Menneskelig-KI interaksjons perspektiv, med fokus på å bruke KI til å støtte brukerne. Video er et audiovisuelt medium. Redigere videoer krever synkronisering av både det visuelle og det auditive med presise operasjoner helt ned på millisekund nivå. Å gjøre dette like enkelt som å redigere tekst er kanskje ikke mulig i dag. Men hvordan skal vi da støtte brukerne med KI og hva er utfordringene med å gjøre det? Det er fem hovedspørsmål som har drevet forskningen i denne avhandlingen. Hva er dagens "state-of-the-art" i KI støttet videoredigering? Hva er behovene og forventningene av fagfolkene om KI? Hva er påvirkningen KI har på effektiviteten og nøyaktigheten når det blir brukt på teksting? Hva er endringene i brukeropplevelsen når det blir brukt KI støttet teksting? Hvordan kan flere KI metoder bli brukt for å støtte beskjærings- og panoreringsoppgaver? Den første artikkelen av denne avhandlingen ga en syntese og kritisk gjennomgang av eksisterende arbeid med KI-baserte verktøy for videoredigering. Artikkelen ga også noen svar på hvordan og hva KI kan bli brukt til for å støtte brukere ved en undersøkelse utført av 14 fagfolk. Den andre studien presenterte en prototype av KI-støttet videoredigerings verktøy bygget på et eksisterende videoproduksjons program. I tillegg kom det en evaluasjon av både ytelse og brukeropplevelse på en KI-støttet teksting fra 24 nybegynnere. Den tredje studien beskrev et idiom-basert verktøy for å konvertere bredskjermsvideoer lagd for TV til smalere størrelsesforhold for mobil og sosiale medieplattformer. Den tredje studien utforsker også nye metoder for å utøve beskjæring og panorering ved å bruke fem forskjellige KI-modeller. Det ble også presentert en evaluering fra fem brukere. I denne avhandlingen brukte vi en brukeropplevelse og oppgave basert framgangsmåte, for å adressere det semi-automatiske i videoredigering.How can we use artificial intelligence (AI) and machine learning (ML) to make video editing as easy as "editing text''? In this thesis, this problem of using AI to support video editing is explored from the human--AI interaction perspective, with the emphasis on using AI to support users. Video is a dual-track medium with audio and visual tracks. Editing videos requires synchronization of these two tracks and precise operations at milliseconds. Making it as easy as editing text might not be currently possible. Then how should we support the users with AI, and what are the current challenges in doing so? There are five key questions that drove the research in this thesis. What is the start of the art in using AI to support video editing? What are the needs and expectations of video professionals from AI? What are the impacts on efficiency and accuracy of subtitles when AI is used to support subtitling? What are the changes in user experience brought on by AI-assisted subtitling? How can multiple AI methods be used to support cropping and panning task? In this thesis, we employed a user experience focused and task-based approach to address the semi-automation in video editing. The first paper of this thesis provided a synthesis and critical review of the existing work on AI-based tools for videos editing and provided some answers to how should and what more AI can be used in supporting users by a survey of 14 video professional. The second paper presented a prototype of AI-assisted subtitling built on a production grade video editing software. It is the first comparative evaluation of both performance and user experience of AI-assisted subtitling with 24 novice users. The third work described an idiom-based tool for converting wide screen videos made for television to narrower aspect ratios for mobile social media platforms. It explores a new method to perform cropping and panning using five AI models, and an evaluation with 5 users and a review with a professional video editor were presented.Doktorgradsavhandlin

    Explorations in interactive illustrative rendering

    Get PDF

    Robust semi-automated path extraction for visualising stenosis of the coronary arteries

    Get PDF
    Computed tomography angiography (CTA) is useful for diagnosing and planning treatment of heart disease. However, contrast agent in surrounding structures (such as the aorta and left ventricle) makes 3-D visualisation of the coronary arteries difficult. This paper presents a composite method employing segmentation and volume rendering to overcome this issue. A key contribution is a novel Fast Marching minimal path cost function for vessel centreline extraction. The resultant centreline is used to compute a measure of vessel lumen, which indicates the degree of stenosis (narrowing of a vessel). Two volume visualisation techniques are presented which utilise the segmented arteries and lumen measure. The system is evaluated and demonstrated using synthetic and clinically obtained datasets

    Machine learning of hierarchical clustering to segment 2D and 3D images

    Get PDF
    We aim to improve segmentation through the use of machine learning tools during region agglomeration. We propose an active learning approach for performing hierarchical agglomerative segmentation from superpixels. Our method combines multiple features at all scales of the agglomerative process, works for data with an arbitrary number of dimensions, and scales to very large datasets. We advocate the use of variation of information to measure segmentation accuracy, particularly in 3D electron microscopy (EM) images of neural tissue, and using this metric demonstrate an improvement over competing algorithms in EM and natural images.Comment: 15 pages, 8 figure

    Fireground location understanding by semantic linking of visual objects and building information models

    Get PDF
    This paper presents an outline for improved localization and situational awareness in fire emergency situations based on semantic technology and computer vision techniques. The novelty of our methodology lies in the semantic linking of video object recognition results from visual and thermal cameras with Building Information Models (BIM). The current limitations and possibilities of certain building information streams in the context of fire safety or fire incident management are addressed in this paper. Furthermore, our data management tools match higher-level semantic metadata descriptors of BIM and deep-learning based visual object recognition and classification networks. Based on these matches, estimations can be generated of camera, objects and event positions in the BIM model, transforming it from a static source of information into a rich, dynamic data provider. Previous work has already investigated the possibilities to link BIM and low-cost point sensors for fireground understanding, but these approaches did not take into account the benefits of video analysis and recent developments in semantics and feature learning research. Finally, the strengths of the proposed approach compared to the state-of-the-art is its (semi -)automatic workflow, generic and modular setup and multi-modal strategy, which allows to automatically create situational awareness, to improve localization and to facilitate the overall fire understanding
    corecore