17 research outputs found

    RADIO: Reference-Agnostic Dubbing Video Synthesis

    One of the most challenging problems in audio-driven talking head generation is achieving high-fidelity detail while ensuring precise synchronization. Given only a single reference image, extracting meaningful identity attributes becomes even more challenging, often causing the network to mirror the facial and lip structures too closely. To address these issues, we introduce RADIO, a framework engineered to yield high-quality dubbed videos regardless of the pose or expression in reference images. The key is to modulate the decoder layers using a latent space composed of audio and reference features. Additionally, we incorporate ViT blocks into the decoder to emphasize high-fidelity details, especially in the lip region. Our experimental results demonstrate that RADIO displays high synchronization without the loss of fidelity. Especially in harsh scenarios where the reference frame deviates significantly from the ground truth, our method outperforms state-of-the-art methods, highlighting its robustness. The pre-trained model and code will be made public after the review. Comment: Under review

    Risk, Unexpected Uncertainty, and Estimation Uncertainty: Bayesian Learning in Unstable Settings

    Recently, evidence has emerged that humans approach learning using Bayesian updating rather than (model-free) reinforcement algorithms in a six-arm restless bandit problem. Here, we investigate what this implies for human appreciation of uncertainty. In our task, a Bayesian learner distinguishes three equally salient levels of uncertainty. First, the Bayesian perceives irreducible uncertainty or risk: even knowing the payoff probabilities of a given arm, the outcome remains uncertain. Second, there is (parameter) estimation uncertainty or ambiguity: payoff probabilities are unknown and need to be estimated. Third, the outcome probabilities of the arms change: the sudden jumps are referred to as unexpected uncertainty. We document how the three levels of uncertainty evolved during the course of our experiment and how they affected the learning rate. We then zoom in on estimation uncertainty, which has been suggested to be a driving force in exploration, in spite of evidence of widespread aversion to ambiguity. Our data corroborate the latter. We discuss neural evidence that foreshadowed the ability of humans to distinguish between the three levels of uncertainty. Finally, we investigate the boundaries of human capacity to implement Bayesian learning. We repeat the experiment with different instructions, reflecting varying levels of structural uncertainty. Under this fourth notion of uncertainty, choices were no better explained by Bayesian updating than by (model-free) reinforcement learning. Exit questionnaires revealed that participants remained unaware of the presence of unexpected uncertainty and failed to acquire the right model with which to implement Bayesian updating.
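
    The distinction between risk and estimation uncertainty in the abstract above can be illustrated with a minimal sketch. It is not the authors' model: it assumes a single arm with a binary payoff, a uniform Beta(1, 1) prior, and no jumps (so unexpected uncertainty is left out). The class name `BetaArm` and its methods are hypothetical and chosen only for this illustration.

```python
from dataclasses import dataclass


@dataclass
class BetaArm:
    """Beta posterior over one arm's payoff probability (binary payoff assumed)."""

    alpha: float = 1.0  # pseudo-count of payoffs; uniform prior is an assumption
    beta: float = 1.0   # pseudo-count of non-payoffs

    def update(self, payoff: bool) -> None:
        # Conjugate Bayesian update: add one count to the matching side.
        if payoff:
            self.alpha += 1.0
        else:
            self.beta += 1.0

    def mean(self) -> float:
        # Posterior mean of the payoff probability.
        return self.alpha / (self.alpha + self.beta)

    def risk(self) -> float:
        # Irreducible outcome variance p * (1 - p), evaluated at the posterior mean.
        p = self.mean()
        return p * (1.0 - p)

    def estimation_uncertainty(self) -> float:
        # Variance of the Beta posterior; it shrinks as observations accumulate.
        n = self.alpha + self.beta
        return self.alpha * self.beta / (n * n * (n + 1.0))


# Usage: three payoffs and one non-payoff observed on a single arm.
arm = BetaArm()
for outcome in [True, True, True, False]:
    arm.update(outcome)
```

    In this sketch, estimation uncertainty falls toward zero as the arm is sampled, whereas risk does not vanish unless the payoff probability approaches 0 or 1. A restless bandit would additionally need a mechanism for sudden changes in the payoff probability, which this sketch deliberately omits.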

    Evaluating indoor positioning systems in a shopping mall: the lessons learned from the IPIN 2018 competition

    The Indoor Positioning and Indoor Navigation (IPIN) conference holds an annual competition in which indoor localization systems from different research groups worldwide are evaluated empirically. The objective of this competition is to establish a systematic evaluation methodology with rigorous metrics both for real-time (on-site) and post-processing (off-site) situations, in a realistic environment unfamiliar to the prototype developers. For the IPIN 2018 conference, this competition was held on September 22nd, 2018, in Atlantis, a large shopping mall in Nantes (France). Four competition tracks (two on-site and two off-site) were designed. They consisted of several 1 km routes traversing several floors of the mall. Along these paths, 180 points were topographically surveyed with a 10 cm accuracy, to serve as ground truth landmarks, combining theodolite measurements, differential global navigation satellite system (GNSS) and 3D scanner systems. 34 teams effectively competed. The accuracy score corresponds to the third quartile (75th percentile) of an error metric that combines the horizontal positioning error and the floor detection. The best results for the on-site tracks showed an accuracy score of 11.70 m (Track 1) and 5.50 m (Track 2), while the best results for the off-site tracks showed an accuracy score of 0.90 m (Track 3) and 1.30 m (Track 4). These results showed that it is possible to obtain high accuracy indoor positioning solutions in large, realistic environments using wearable light-weight sensors without deploying any beacon. This paper describes the organization work of the tracks, analyzes the methodology used to quantify the results, reviews the lessons learned from the competition and discusses its future.
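
    The accuracy score described above (third quartile of an error that combines horizontal error and floor detection) might be computed along the following lines. This is a sketch under stated assumptions, not the competition's published code: the abstract does not say how the two terms are combined, so the additive per-floor penalty, its default value of 15 m, the quantile interpolation method, and the names `combined_error` and `accuracy_score` are all assumptions made for illustration.

```python
import statistics
from typing import List, Tuple

# One evaluation point: (horizontal error in metres, true floor, estimated floor).
Sample = Tuple[float, int, int]


def combined_error(sample: Sample, floor_penalty_m: float = 15.0) -> float:
    """Horizontal error plus a fixed penalty per wrongly detected floor.

    The additive form and the 15 m default are assumptions for this sketch.
    """
    horizontal_m, true_floor, est_floor = sample
    return horizontal_m + floor_penalty_m * abs(true_floor - est_floor)


def accuracy_score(samples: List[Sample], floor_penalty_m: float = 15.0) -> float:
    """Third quartile (75th percentile) of the combined error over all points."""
    errors = [combined_error(s, floor_penalty_m) for s in samples]
    # quantiles(..., n=4) returns the three quartile cut points; index 2 is Q3.
    return statistics.quantiles(errors, n=4, method="inclusive")[2]


# Usage: five evaluation points, one of them placed on the wrong floor.
points = [(1.0, 0, 1), (2.0, 0, 0), (3.0, 1, 1), (4.0, 1, 1), (5.0, 2, 2)]
score = accuracy_score(points)
```

    Using a quartile rather than the mean keeps a few very large errors, such as a single floor misdetection, from dominating the score, while still penalizing systems that fail often.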

    The IPIN 2019 Indoor Localisation Competition—Description and Results

    The IPIN 2019 Competition, sixth in a series of IPIN competitions, was held at the CNR Research Area of Pisa (IT), integrated into the program of the IPIN 2019 Conference. It included two on-site real-time Tracks and three off-site Tracks. The four Tracks presented in this paper were set in the same environment, made of two buildings close together for a total usable area of 1000 m² outdoors and 6000 m² indoors over three floors, with a total path length exceeding 500 m. IPIN competitions, based on the EvAAL framework, have aimed at comparing the accuracy performance of personal positioning systems in fair and realistic conditions: past editions of the competition were carried out in big conference settings, university campuses and a shopping mall. Positioning accuracy is computed while the person carrying the system under test walks at normal walking speed, uses lifts and goes up and down stairs or briefly stops at given points. Results presented here are a showcase of state-of-the-art systems tested side by side in real-world settings as part of the on-site real-time competition Tracks. Results for off-site Tracks allow a detailed and reproducible comparison of the most recent positioning and tracking algorithms in the same environment as the on-site Tracks.

    An Assistive Role of a Machine Learning Network in Diagnosis of Middle Ear Diseases

    The present study aimed to develop a machine learning network to diagnose middle ear diseases with tympanic membrane images and to identify its assistive role in the diagnostic process. The medical records of subjects who underwent ear endoscopy tests were reviewed. From these records, 2272 diagnostic tympanic membrane images were appropriately labeled as normal, otitis media with effusion (OME), chronic otitis media (COM), or cholesteatoma and were used for training. We developed the “ResNet18 + Shuffle” network and validated the model performance. Seventy-one representative cases were selected to test the final accuracy of the network and resident physicians. We asked 10 resident physicians to make diagnoses from tympanic membrane images with and without the help of the machine learning network, and the change in the diagnostic performance of resident physicians with the aid of the answers from the machine learning network was assessed. The devised network showed a highest accuracy of 97.18%. A five-fold validation showed that the network successfully diagnosed ear diseases with an accuracy greater than 93%. All resident physicians were able to diagnose middle ear diseases more accurately with the help of the machine learning network. The increase in diagnostic accuracy was up to 18% (1.4% to 18.4%). The machine learning network successfully classified middle ear diseases and was assistive to clinicians in the interpretation of tympanic membrane images.

    Arf6 exacerbates allergic asthma through cell-to-cell transmission of ASC inflammasomes

    Asthma is a chronic inflammatory disease of the airways associated with excess production of Th2 cytokines and lung eosinophil accumulation. This inflammatory response persists in spite of steroid administration that blocks autocrine/paracrine loops of inflammatory cytokines, and the detailed mechanisms underlying asthma exacerbation remain unclear. Here, we show that asthma exacerbation is triggered by airway macrophages through a prion-like cell-to-cell transmission of extracellular particulates, including ASC protein, that assemble inflammasomes and mediate IL-1 beta production. OVA-induced allergic asthma and associated IL-1 beta production were alleviated in mice with small GTPase Arf6-deficient macrophages. The extracellular ASC specks were slightly engulfed by Arf6(-/-) macrophages, and the IL-1 beta production was reduced in Arf6(-/-) macrophages compared with that in WT macrophages. Furthermore, pharmacological inhibition of the Arf6 guanine nucleotide exchange factor suppressed asthma-like allergic inflammation in OVA-challenged WT mice. Collectively, the Arf6-dependent intercellular transmission of extracellular ASC specks contributes to the amplification of allergic inflammation and subsequent asthma exacerbation.

    Artificial Intelligence-Powered Spatial Analysis of Tumor-Infiltrating Lymphocytes as Complementary Biomarker for Immune Checkpoint Inhibition in Non-Small-Cell Lung Cancer

    PURPOSE Biomarkers on the basis of tumor-infiltrating lymphocytes (TIL) are potentially valuable in predicting the effectiveness of immune checkpoint inhibitors (ICI). However, clinical application remains challenging because of methodologic limitations and the laborious process involved in spatial analysis of TIL distribution in whole-slide images (WSI). METHODS We have developed an artificial intelligence (AI)-powered WSI analyzer of TIL in the tumor microenvironment that can define three immune phenotypes (IPs): inflamed, immune-excluded, and immune-desert. These IPs were correlated with tumor response to ICI and survival in two independent cohorts of patients with advanced non-small-cell lung cancer (NSCLC). RESULTS Inflamed IP correlated with enrichment in local immune cytolytic activity, higher response rate, and prolonged progression-free survival compared with patients with immune-excluded or immune-desert phenotypes. At the WSI level, there was significant positive correlation between tumor proportion score (TPS) as determined by the AI model and control TPS analyzed by pathologists (P < .001). Overall, 44.0% of tumors were inflamed, 37.1% were immune-excluded, and 18.9% were immune-desert. Incidence of inflamed IP in patients with programmed death ligand-1 TPS at < 1%, 1%-49%, and >= 50% was 31.7%, 42.5%, and 56.8%, respectively. Median progression-free survival and overall survival were, respectively, 4.1 months and 24.8 months with inflamed IP, 2.2 months and 14.0 months with immune-excluded IP, and 2.4 months and 10.6 months with immune-desert IP. CONCLUSION The AI-powered spatial analysis of TIL correlated with tumor response and progression-free survival of ICI in advanced NSCLC. This is potentially a supplementary biomarker to TPS as determined by a pathologist.