
9 research outputs found

Vid2speech: Speech Reconstruction from Silent Video

Author: Ephrat Ariel
Peleg Shmuel
Publication date: 09/01/2017

Speechreading is a notoriously difficult task for humans to perform. In this paper we present an end-to-end model based on a convolutional neural network (CNN) for generating an intelligible acoustic speech signal from silent video frames of a speaking person. The proposed CNN generates sound features for each frame based on its neighboring frames. Waveforms are then synthesized from the learned speech features to produce intelligible speech. We show that by leveraging the automatic feature learning capabilities of a CNN, we can obtain state-of-the-art word intelligibility on the GRID dataset, and show promising results for learning out-of-vocabulary (OOV) words.
Comment: Accepted for publication at ICASSP 201

arXiv.org e-Print Archive

Crossref

Seeing Through Noise: Visually Driven Speaker Separation and Enhancement

Author: Ephrat Ariel
Gabbay Aviv
Halperin Tavi
Peleg Shmuel
Publication date: 09/02/2018

Isolating the voice of a specific person while filtering out other voices or background noises is challenging when video is shot in noisy environments. We propose audio-visual methods to isolate the voice of a single speaker and eliminate unrelated sounds. First, face motions captured in the video are used to estimate the speaker's voice, by passing the silent video frames through a video-to-speech neural network-based model. Then the speech predictions are applied as a filter on the noisy input audio. This approach avoids using mixtures of sounds in the learning process, as the number of such possible mixtures is huge, and would inevitably bias the trained model. We evaluate our method on two audio-visual datasets, GRID and TCD-TIMIT, and show that our method attains significant SDR and PESQ improvements over the raw video-to-speech predictions, and a well-known audio-only method.
Comment: Supplementary video: https://www.youtube.com/watch?v=qmsyj7vAzo

arXiv.org e-Print Archive

Crossref

Lumiere: A Space-Time Diffusion Model for Video Generation

Author: Bar-Tal Omer
Chefer Hila
Dekel Tali
Ephrat Ariel
Herrmann Charles
Hur Junhwa
Li Yuanzhen
Liu Guanghui
Michaeli Tomer
Mosseri Inbar
Paiss Roni
Raj Amit
Rubinstein Michael
Sun Deqing
Tov Omer
Wang Oliver
Zada Shiran
Publication date: 05/02/2024

We introduce Lumiere -- a text-to-video diffusion model designed for synthesizing videos that portray realistic, diverse and coherent motion -- a pivotal challenge in video synthesis. To this end, we introduce a Space-Time U-Net architecture that generates the entire temporal duration of the video at once, through a single pass in the model. This is in contrast to existing video models which synthesize distant keyframes followed by temporal super-resolution -- an approach that inherently makes global temporal consistency difficult to achieve. By deploying both spatial and (importantly) temporal down- and up-sampling and leveraging a pre-trained text-to-image diffusion model, our model learns to directly generate a full-frame-rate, low-resolution video by processing it in multiple space-time scales. We demonstrate state-of-the-art text-to-video generation results, and show that our design easily facilitates a wide range of content creation tasks and video editing applications, including image-to-video, video inpainting, and stylized generation.
Comment: Webpage: https://lumiere-video.github.io/ | Video: https://www.youtube.com/watch?v=wxLr02Dz2S

arXiv.org e-Print Archive

Patient-specific and global convolutional neural networks for robust automatic liver tumor delineation in follow-up CT studies

Author: AB Miller
Ariel Ephrat
C Coghlin
D Wong
E Eisenhauer
EL Chen
H Greenspan
J Zhou
Jacob Sosna
JS Hong
K Mala
Leo Joskowicz
M Bilello
M Freiman
MS Hassouna
Naama Lev-Cohain
Refael Vivanti
RL Lewis
S Klein
W Li
Publication venue: 'Springer Science and Business Media LLC'

Crossref