
    Pure phase-encoded MRI and classification of solids

    Here, the authors combine a pure phase-encoded magnetic resonance imaging (MRI) method with a new tissue-classification technique to make geometric models of a human tooth. They demonstrate the feasibility of three-dimensional imaging of solids using a conventional 11.7-T NMR spectrometer. In solid-state imaging, confounding line-broadening effects are typically eliminated using coherent averaging methods. Instead, the authors circumvent them by detecting the proton signal at a fixed phase-encode time following the radio-frequency excitation. By a judicious choice of the phase-encode time in the MRI protocol, the authors differentiate enamel and dentine sufficiently to successfully apply a new classification algorithm. This tissue-classification algorithm identifies the distribution of different material types, such as enamel and dentine, in volumetric data. In this algorithm, the authors treat a voxel as a volume, not as a single point, and assume that each voxel may contain more than one material. They use the distribution of MR image intensities within each voxel-sized volume to estimate the relative proportion of each material using a probabilistic approach. This combined approach, involving MRI and data classification, is directly applicable to bone imaging and hard-tissue contrast-based modeling of biological solids.
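    The voxel-as-a-volume idea can be sketched in a few lines. This is a minimal illustration, not the authors' algorithm: it assumes each material's MR intensity follows a known Gaussian model (the means, widths, and sample values below are invented) and averages per-sample posteriors to estimate the mixture fractions inside one voxel-sized volume.

```python
import numpy as np

# Hypothetical per-material intensity models (mean, std), e.g. enamel vs. dentine.
MATERIALS = {"enamel": (40.0, 8.0), "dentine": (120.0, 15.0)}

def gaussian_pdf(x, mu, sigma):
    return np.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * np.sqrt(2 * np.pi))

def material_fractions(voxel_samples):
    """Estimate the relative proportion of each material inside one
    voxel-sized volume from the distribution of intensities sampled in it."""
    samples = np.asarray(voxel_samples, dtype=float)
    # Per-sample likelihood under each material's intensity model.
    likes = np.stack([gaussian_pdf(samples, mu, sd) for mu, sd in MATERIALS.values()])
    # Soft (posterior) assignment of each sample, assuming uniform priors.
    post = likes / likes.sum(axis=0, keepdims=True)
    # Averaging the soft assignments gives the estimated mixture fractions.
    return dict(zip(MATERIALS, post.mean(axis=1)))

# Three enamel-like and two dentine-like intensity samples from one voxel.
fracs = material_fractions([38, 41, 44, 118, 125])
```

    Because the fractions are averages of per-sample posteriors, they always sum to one, and a voxel straddling an enamel–dentine boundary naturally receives a mixed label.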

    Text-based Editing of Talking-head Video

    Editing talking-head video to change the speech content or to remove filler words is challenging. We propose a novel method to edit talking-head video based on its transcript to produce a realistic output video in which the dialogue of the speaker has been modified, while maintaining a seamless audio-visual flow (i.e. no jump cuts). Our method automatically annotates an input talking-head video with phonemes, visemes, 3D face pose and geometry, reflectance, expression and scene illumination per frame. To edit a video, the user has to only edit the transcript, and an optimization strategy then chooses segments of the input corpus as base material. The annotated parameters corresponding to the selected segments are seamlessly stitched together and used to produce an intermediate video representation in which the lower half of the face is rendered with a parametric face model. Finally, a recurrent video generation network transforms this representation to a photorealistic video that matches the edited transcript. We demonstrate a large variety of edits, such as the addition, removal, and alteration of words, as well as convincing language translation and full sentence synthesis.

    3DTeethSeg'22: 3D Teeth Scan Segmentation and Labeling Challenge

    Teeth localization, segmentation, and labeling from intra-oral 3D scans are essential tasks in modern dentistry to enhance dental diagnostics, treatment planning, and population-based studies on oral health. However, developing automated algorithms for teeth analysis presents significant challenges due to variations in dental anatomy, imaging protocols, and limited availability of publicly accessible data. To address these challenges, the 3DTeethSeg'22 challenge was organized in conjunction with the International Conference on Medical Image Computing and Computer Assisted Intervention (MICCAI) in 2022, with a call for algorithms tackling teeth localization, segmentation, and labeling from intraoral 3D scans. A dataset comprising a total of 1800 scans from 900 patients was prepared, and each tooth was individually annotated by a human-machine hybrid algorithm. A total of 6 algorithms were evaluated on this dataset. In this study, we present the evaluation results of the 3DTeethSeg'22 challenge. The 3DTeethSeg'22 challenge code can be accessed at: https://github.com/abenhamadou/3DTeethSeg22_challenge. Comment: 29 pages, MICCAI 2022 Singapore, Satellite Event, Challenge.
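    The challenge defines its own official localization, segmentation, and labeling metrics; purely as an illustrative stand-in, a scan-level teeth segmentation-and-labeling result could be scored by intersection-over-union per tooth label. Everything below (function name, label convention) is an assumption for illustration, not the challenge's evaluation code.

```python
import numpy as np

def mean_per_tooth_iou(pred_labels, gt_labels):
    """Toy score for a segmentation-and-labeling result: IoU computed
    separately for each tooth label present in the ground truth, then
    averaged. Label 0 is treated as gingiva/background and skipped."""
    pred = np.asarray(pred_labels)
    gt = np.asarray(gt_labels)
    ious = []
    for label in np.unique(gt):
        if label == 0:
            continue
        inter = np.logical_and(pred == label, gt == label).sum()
        union = np.logical_or(pred == label, gt == label).sum()
        ious.append(inter / union if union else 0.0)
    return float(np.mean(ious))

# One label per scan vertex; two teeth (11, 12) plus background.
score = mean_per_tooth_iou([11, 11, 12, 0], [11, 12, 12, 0])
```

    Scoring per label rather than over the whole foreground penalizes both missed teeth and mislabeled ones (e.g. predicting tooth 11 where tooth 12 lies counts against both labels).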

    Synergistic Visualization And Quantitative Analysis Of Volumetric Medical Images

    The medical diagnosis process starts with an interview with the patient and continues with the physical exam. In practice, the medical professional may require additional screenings for a precise diagnosis. Medical imaging is one of the most frequently used non-invasive screening methods for acquiring insight into the human body. Medical imaging is not only essential for accurate diagnosis; it can also enable early prevention. Medical data visualization refers to projecting medical data into a human-understandable format on media such as 2D or head-mounted displays, without imposing any interpretation that might lead to clinical intervention. In contrast to visualization, quantification refers to extracting information from the medical scan to enable clinicians to make fast and accurate decisions. Despite the extraordinary progress in both medical visualization and quantitative radiology, efforts to improve these two complementary fields are often pursued independently, and their synergistic combination is under-studied. Existing image-based software platforms mostly fail to enter routine clinical use due to the lack of a unified strategy that guides clinicians both visually and quantitatively. Hence, there is an urgent need for a bridge connecting medical visualization and automatic quantification algorithms in the same software platform. In this thesis, we aim to fill this research gap by visualizing medical images interactively from anywhere, and by performing fast, accurate, and fully automatic quantification of the medical imaging data. To this end, we propose several innovative and novel methods. Specifically, we solve the following sub-problems of the ultimate goal: (1) direct web-based out-of-core volume rendering, (2) robust, accurate, and efficient learning-based algorithms to segment highly pathological medical data, (3) automatic landmarking to aid diagnosis and surgical planning, and (4) novel artificial intelligence algorithms to determine the sufficient and necessary data for deriving large-scale problems.

    Harnessing AI for Speech Reconstruction using Multi-view Silent Video Feed

    Speechreading or lipreading is the technique of understanding and extracting phonetic features from a speaker's visual features, such as movement of the lips, face, teeth, and tongue. It has a wide range of multimedia applications, such as surveillance, Internet telephony, and aids for people with hearing impairments. However, most work in speechreading has been limited to text generation from silent videos. Recently, research has started venturing into generating (audio) speech from silent video sequences, but there have been no developments thus far in dealing with divergent views and poses of a speaker. Thus, although multiple camera feeds of a speaker are often available, these feeds have not been exploited to handle the different poses. To this end, this paper presents the world's first multi-view speech reading and reconstruction system. This work pushes the boundaries of multimedia research by putting forth a model that leverages silent video feeds from multiple cameras recording the same subject to generate intelligible speech for a speaker. Initial results confirm the usefulness of exploiting multiple camera views in building an efficient speech reading and reconstruction system, and further indicate the optimal placement of cameras for maximum intelligibility of speech. The paper then lays out various innovative applications for the proposed system, focusing on its potentially prodigious impact not just in the security arena but in many other multimedia analytics problems. Comment: 2018 ACM Multimedia Conference (MM '18), October 22-26, 2018, Seoul, Republic of Korea.

    Visual analytics methods for shape analysis of biomedical images exemplified on rodent skull morphology

    In morphometrics and its application fields, such as medicine and biology, experts are interested in causal relations of variation in organismic shape to phylogenetic, ecological, geographical, epidemiological, or disease factors - or, put more succinctly by Fred L. Bookstein, morphometrics is "the study of covariances of biological form". In order to reveal causes for shape variability, targeted statistical analysis correlating shape features against external and internal factors is necessary but, due to the complexity of the problem, often not feasible in an automated way. Therefore, a visual analytics approach is proposed in this thesis that couples interactive visualizations with automated statistical analyses in order to stimulate the generation and qualitative assessment of hypotheses on relevant shape features and their potentially affecting factors. To this end, long-established morphometric techniques are combined with recent shape-modeling approaches from geometry processing and medical imaging, leading to novel visual analytics methods for shape analysis. When used in concert, these methods facilitate targeted analysis of characteristic shape differences between groups, co-variation between different structures on the same anatomy, and correlation of shape to extrinsic attributes. A special focus is placed on accurate modeling and interactive rendering of image deformations at high spatial resolution, because that allows for faithful representation and communication of diminutive shape features, large shape differences, and volumetric structures. The utility of the presented methods is demonstrated in case studies conducted together with a collaborating morphometrics expert. The rodent skull and its mandible, assessed via computed tomography scans, serve as the exemplary model structure.

    Learning to Transform Time Series with a Few Examples

    We describe a semi-supervised regression algorithm that learns to transform one time series into another time series given examples of the transformation. This algorithm is applied to tracking, where a time series of observations from sensors is transformed to a time series describing the pose of a target. Instead of defining and implementing such transformations for each tracking task separately, our algorithm learns a memoryless transformation of time series from a few example input-output mappings. The algorithm searches for a smooth function that fits the training examples and, when applied to the input time series, produces a time series that evolves according to assumed dynamics. The learning procedure is fast and lends itself to a closed-form solution. It is closely related to nonlinear system identification and manifold learning techniques. We demonstrate our algorithm on the tasks of tracking RFID tags from signal-strength measurements and recovering the pose of rigid objects, deformable bodies, and articulated bodies from video sequences. For these tasks, this algorithm requires significantly fewer examples compared to fully supervised regression algorithms or semi-supervised learning algorithms that do not take the dynamics of the output time series into account.
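    To give a flavor of the closed-form, dynamics-aware estimation described above, here is a heavily simplified sketch. It is not the paper's method: it drops the learned dependence on the input series entirely and keeps only two ingredients the abstract mentions, a least-squares fit to the few labeled frames plus a smoothness penalty standing in for the assumed output dynamics; all names and parameter values are assumptions.

```python
import numpy as np

def smooth_outputs(T, labeled_idx, y_labeled, lam_fit=1.0, lam_smooth=10.0):
    """Closed-form estimate of a length-T output series from a few labeled
    frames: a least-squares fit to the labels plus a second-difference
    penalty that favors smoothly evolving outputs."""
    # Second-difference operator: penalizes curvature of the trajectory.
    D = np.zeros((T - 2, T))
    for t in range(T - 2):
        D[t, t:t + 3] = [1.0, -2.0, 1.0]
    # Selector matrix picking out the labeled frames.
    S = np.zeros((len(labeled_idx), T))
    for i, t in enumerate(labeled_idx):
        S[i, t] = 1.0
    # Normal equations of the combined quadratic objective; a tiny ridge
    # term keeps the linear system well-posed.
    A = lam_fit * S.T @ S + lam_smooth * D.T @ D + 1e-8 * np.eye(T)
    b = lam_fit * S.T @ np.asarray(y_labeled, dtype=float)
    return np.linalg.solve(A, b)

# Only frames 0 and 4 are labeled; the rest are filled in smoothly.
y = smooth_outputs(5, [0, 4], [0.0, 4.0])
```

    Because both terms are quadratic, the minimizer comes from a single linear solve, mirroring the closed-form character of the learning procedure; with only endpoint labels, the smoothness term drives the unlabeled frames toward the interpolating trajectory.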