Field Operation Planning for Agricultural Vehicles: A Hierarchical Modeling Framework
Rosana G. Moreira, Editor-in-Chief, Texas A&M University. This is a paper from the International Commission of Agricultural Engineering (CIGR, Commission Internationale du Génie Rural) E-Journal, Volume 9 (2007): Field Operation Planning for Agricultural Vehicles: A Hierarchical Modeling Framework. Manuscript PM 06 021. Vol. IX. February 2007.
Video-driven speech reconstruction using generative adversarial networks
Speech is a means of communication which relies on both audio and visual information. The absence of one modality can often lead to confusion or misinterpretation of information. In this paper we present an end-to-end temporal model capable of directly synthesising audio from silent video, without needing to transform to and from intermediate features. Our proposed approach, based on GANs, is capable of producing natural-sounding, intelligible speech which is synchronised with the video. The performance of our model is evaluated on the GRID dataset for both speaker-dependent and speaker-independent scenarios. To the best of our knowledge, this is the first method that maps video directly to raw audio and the first to produce intelligible speech when tested on previously unseen speakers. We evaluate the synthesised audio not only on sound quality but also on the accuracy of the spoken words.
Optimal Dynamic Motion Sequence Generation for Multiple Harvesters
Rosana G. Moreira, Editor-in-Chief, Texas A&M University. This is a paper from the International Commission of Agricultural Engineering (CIGR, Commission Internationale du Génie Rural) E-Journal, Volume 9 (2007): Optimal Dynamic Motion Sequence Generation for Multiple Harvesters. Manuscript ATOE 07 001. Vol. IX. July 2007.
Realistic speech-driven facial animation with GANs
Speech-driven facial animation is the process that automatically synthesizes talking characters based on speech signals. The majority of work in this domain creates a mapping from audio features to visual features. This approach often requires post-processing using computer graphics techniques to produce realistic, albeit subject-dependent, results. We present an end-to-end system that generates videos of a talking head, using only a still image of a person and an audio clip containing speech, without relying on handcrafted intermediate features. Our method generates videos which have (a) lip movements that are in sync with the audio and (b) natural facial expressions such as blinks and eyebrow movements. Our temporal GAN uses 3 discriminators focused on achieving detailed frames, audio-visual synchronization, and realistic expressions. We quantify the contribution of each component in our model using an ablation study and we provide insights into the latent representation of the model. The generated videos are evaluated on sharpness, reconstruction quality, lip-reading accuracy, and synchronization, as well as their ability to generate natural blinks.
Speech-driven facial animations improve speech-in-noise comprehension of humans
Understanding speech becomes a demanding task when the environment is noisy. Comprehension of speech in noise can be substantially improved by looking at the speaker's face, and this audiovisual benefit is even more pronounced in people with hearing impairment. Recent advances in AI have made it possible to synthesize photorealistic talking faces from a speech recording and a still image of a person's face in an end-to-end manner. However, it has remained unknown whether such facial animations improve speech-in-noise comprehension. Here we consider facial animations produced by a recently introduced generative adversarial network (GAN), and show that humans cannot distinguish between the synthesized and the natural videos. Importantly, we then show that the end-to-end synthesized videos significantly aid humans in understanding speech in noise, although the natural facial motions yield an even higher audiovisual benefit. We further find that an audiovisual speech recognizer (AVSR) benefits from the synthesized facial animations as well. Our results suggest that synthesizing facial motions from speech can be used to aid speech comprehension in difficult listening environments.
Mucopolysaccharidosis VI
Mucopolysaccharidosis VI (MPS VI) is a lysosomal storage disease with progressive multisystem involvement, associated with a deficiency of arylsulfatase B leading to the accumulation of dermatan sulfate. Birth prevalence is between 1 in 43,261 and 1 in 1,505,160 live births. The disorder shows a wide spectrum of symptoms from slowly to rapidly progressing forms. The characteristic skeletal dysplasia includes short stature, dysostosis multiplex and degenerative joint disease. Rapidly progressing forms may have onset from birth, elevated urinary glycosaminoglycans (generally >100 μg/mg creatinine), severe dysostosis multiplex, short stature, and death before the 2nd or 3rd decade. A more slowly progressing form has been described as having later onset, mildly elevated glycosaminoglycans (generally <100 μg/mg creatinine), mild dysostosis multiplex, and death in the 4th or 5th decade. Other clinical findings may include cardiac valve disease, reduced pulmonary function, hepatosplenomegaly, sinusitis, otitis media, hearing loss, sleep apnea, corneal clouding, carpal tunnel disease, and inguinal or umbilical hernia. Although intellectual deficit is generally absent in MPS VI, central nervous system findings may include cervical cord compression caused by cervical spinal instability, meningeal thickening and/or bony stenosis, communicating hydrocephalus, optic nerve atrophy and blindness. The disorder is transmitted in an autosomal recessive manner and is caused by mutations in the ARSB gene, located on chromosome 5 (5q13-5q14). Over 130 ARSB mutations have been reported, causing absent or reduced arylsulfatase B (N-acetylgalactosamine 4-sulfatase) activity and interrupted dermatan sulfate and chondroitin sulfate degradation.
Diagnosis generally requires evidence of the clinical phenotype, arylsulfatase B enzyme activity <10% of the lower limit of normal in cultured fibroblasts or isolated leukocytes, and demonstration of normal activity of a different sulfatase enzyme (to exclude multiple sulfatase deficiency). The finding of elevated urinary dermatan sulfate with the absence of heparan sulfate is supportive. In addition to multiple sulfatase deficiency, the differential diagnosis should also include other forms of MPS (MPS I, II, IVA, VII), sialidosis and mucolipidosis. Before enzyme replacement therapy (ERT) with galsulfase (Naglazyme®), clinical management was limited to supportive care and hematopoietic stem cell transplantation. Galsulfase is now widely available and is a specific therapy providing improved endurance with an acceptable safety profile. Prognosis is variable depending on the age of onset, rate of disease progression, age at initiation of ERT, and the quality of the medical care provided.
Row-sensing templates: A generic 3D sensor-based approach to robot localization with respect to orchard row centerlines
Accurate robot localization relative to orchard row centerlines is essential for autonomous guidance where satellite signals are often obstructed by foliage. Existing sensor-based approaches rely on various features extracted from images and point clouds. However, no selected features are available consistently, because the visual and geometrical characteristics of orchard rows change drastically when tree types, growth stages, canopy management practices, seasons, and weather conditions change. In this study, we introduce a novel localization method that does not rely on features; instead, it relies on the concept of a row-sensing template, which is the expected observation of a 3D sensor traveling in an orchard row when the sensor is anywhere on the centerline and perfectly aligned with it. First, the template is built from a few measurements, provided that the sensor's true pose with respect to the centerline is available. Then, during navigation, the best pose estimate (and its confidence) is computed by maximizing the match between the template and the sensed point cloud using particle filtering. The method can adapt to various orchards and conditions by rebuilding the template. Experiments were performed in a vineyard and in an orchard in different seasons. Results showed that the lateral mean absolute error (MAE) was less than 3.6% of the row width, and the heading MAE was less than 1.72°. Localization was robust, as errors did not increase even when up to 75% of the measurement points were missing. The results indicate that template-based localization can provide a generic approach for accurate and robust localization in real-world orchards.
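The core idea of the abstract above, matching a pose-dependent expected observation ("template") against sensed data with a particle filter, can be illustrated with a minimal sketch. Everything here is an assumption for illustration: the rows are idealised as two parallel lines, the "template" is the set of ranges a planar beam sensor would measure from a given lateral offset and heading, and a single batch of weighted pose hypotheses stands in for a full particle filter with motion updates.

```python
import math
import random

ROW_HALF_WIDTH = 1.5  # assumed half row-width in metres (illustrative value)
BEAM_ANGLES = [-1.2, -0.9, -0.6, -0.3, 0.3, 0.6, 0.9, 1.2]  # beam angles (rad) rel. to sensor axis

def expected_ranges(lateral, heading):
    """Row-sensing 'template': the range each beam would measure to the
    row lines x = +/-ROW_HALF_WIDTH for a sensor at offset `lateral`
    from the centerline, rotated by `heading` radians."""
    ranges = []
    for a in BEAM_ANGLES:
        s = math.sin(a + heading)
        wall = ROW_HALF_WIDTH if s > 0 else -ROW_HALF_WIDTH
        ranges.append((wall - lateral) / s)  # positive range to the hit row line
    return ranges

def localize(observed, n_particles=3000, sigma=0.1, seed=1):
    """Score random pose hypotheses by how well their template matches the
    observed ranges; return the likelihood-weighted mean pose."""
    rng = random.Random(seed)
    total_w = est_lat = est_head = 0.0
    for _ in range(n_particles):
        lat = rng.uniform(-1.0, 1.0)
        head = rng.uniform(-0.3, 0.3)
        err = sum((o - e) ** 2
                  for o, e in zip(observed, expected_ranges(lat, head)))
        w = math.exp(-err / (2 * sigma ** 2))  # Gaussian measurement likelihood
        total_w += w
        est_lat += w * lat
        est_head += w * head
    return est_lat / total_w, est_head / total_w

# Simulate a sensor 0.4 m right of the centerline, yawed 0.05 rad (noise-free for brevity).
lat, head = localize(expected_ranges(0.4, 0.05))
```

With a sharp likelihood the weighted mean concentrates on the hypotheses whose templates best match the observation, which is the "maximize the match" step described in the abstract; the real method does this against full 3D point clouds and propagates particles over time.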
Decomposition of Agricultural Tasks into Robotic Behaviours
Rosana G. Moreira, Editor-in-Chief; Texas A&M UniversityThis is a paper from International Commission of Agricultural Engineering (CIGR, Commission Internationale du Genie Rural) E-Journal Volume 9 (2007): Decomposition of Agricultural Tasks into Robotic Behaviours. Manuscript PM 07 006. Vol. IX. October, 2007
End-to-End Speech-Driven Facial Animation with Temporal GANs
Speech-driven facial animation is the process that uses speech signals to automatically synthesize a talking character. The majority of work in this domain creates a mapping from audio features to visual features. This often requires post-processing using computer graphics techniques to produce realistic, albeit subject-dependent, results. We present a system for generating videos of a talking head, using a still image of a person and an audio clip containing speech, that does not rely on any handcrafted intermediate features. To the best of our knowledge, this is the first method capable of generating subject-independent realistic videos directly from raw audio. Our method can generate videos which have (a) lip movements that are in sync with the audio and (b) natural facial expressions such as blinks and eyebrow movements. We achieve this by using a temporal GAN with 2 discriminators, which are capable of capturing different aspects of the video. The effect of each component in our system is quantified through an ablation study. The generated videos are evaluated on their sharpness, reconstruction quality, and lip-reading accuracy. Finally, a user study is conducted, confirming that temporal GANs lead to more natural sequences than a static GAN-based approach.
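The two-discriminator arrangement described above, one critic judging individual frames and one judging the audio-video sequence jointly, can be sketched in PyTorch. This is not the paper's architecture: the tiny linear/GRU networks, the feature dimensions, and the Wasserstein-style scores are all placeholder assumptions chosen only to show how two critics feed one generator loss.

```python
import torch
import torch.nn as nn

# Illustrative shapes (assumptions): a "video" is (batch, frames, features),
# and the audio is encoded to the same (batch, frames, features) layout.
B, T, F = 2, 8, 16

generator = nn.GRU(input_size=F, hidden_size=F, batch_first=True)  # audio -> frame features
frame_disc = nn.Sequential(nn.Linear(F, 32), nn.ReLU(), nn.Linear(32, 1))  # judges single frames
seq_disc = nn.GRU(input_size=2 * F, hidden_size=32, batch_first=True)  # judges audio-visual pairs
seq_head = nn.Linear(32, 1)

audio = torch.randn(B, T, F)
fake_video, _ = generator(audio)

# Frame discriminator scores each frame independently (frame realism / sharpness).
frame_score = frame_disc(fake_video.reshape(B * T, F)).mean()

# Sequence discriminator sees audio and video together over time (synchronisation,
# temporal coherence); the final GRU state summarises the whole sequence.
_, h = seq_disc(torch.cat([audio, fake_video], dim=-1))
sync_score = seq_head(h[-1]).mean()

# The generator is trained to fool both critics at once, so the two aspects
# (per-frame realism and audio-visual sync) are optimised jointly.
gen_loss = -(frame_score + sync_score)
```

In training, each discriminator would also be updated on real/fake pairs with its own loss; the point of the sketch is only that the generator's objective sums the scores of critics that look at different aspects of the output.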