Karaoker: Alignment-free singing voice synthesis with speech training data
Existing singing voice synthesis models (SVS) are usually trained on singing
data and depend on either error-prone time-alignment and duration features or
explicit music score information. In this paper, we propose Karaoker, a
multispeaker Tacotron-based model conditioned on voice characteristic features
that is trained exclusively on spoken data without requiring time-alignments.
Karaoker synthesizes singing voice following a multi-dimensional template
extracted from a source waveform of an unseen speaker/singer. The model is
jointly conditioned with a single deep convolutional encoder on continuous data
including pitch, intensity, harmonicity, formants, cepstral peak prominence and
octaves. We extend the text-to-speech training objective with feature
reconstruction, classification and speaker identification tasks that guide the
model to an accurate result. In addition to multi-tasking, we also employ a
Wasserstein GAN training scheme as well as new losses on the acoustic model's
output to further refine the quality of the model. Comment: Submitted to INTERSPEECH 202
Self-supervised learning for robust voice cloning
Voice cloning is a difficult task which requires robust and informative
features incorporated in a high quality TTS system in order to effectively copy
an unseen speaker's voice. In our work, we utilize features learned in a
self-supervised framework via the Bootstrap Your Own Latent (BYOL) method,
which is shown to produce high quality speech representations when specific
audio augmentations are applied to the vanilla algorithm. We further extend the
augmentations in the training procedure to aid the resulting features to
capture the speaker identity and to make them robust to noise and acoustic
conditions. The learned features are used as pre-trained utterance-level
embeddings and as inputs to a Non-Attentive Tacotron based architecture, aiming
to achieve multispeaker speech synthesis without utilizing additional speaker
features. This method enables us to train our model in an unlabeled
multispeaker dataset as well as use unseen speaker embeddings to copy a
speaker's voice. Subjective and objective evaluations are used to validate the
proposed model, as well as the robustness to the acoustic conditions of the
target utterance. Comment: Accepted to INTERSPEECH 202
Improved Text Emotion Prediction Using Combined Valence and Arousal Ordinal Classification
Emotion detection in textual data has received growing interest in recent
years, as it is pivotal for developing empathetic human-computer interaction
systems. This paper introduces a method for categorizing emotions from text,
which accounts for the varying similarities and distinctions among
emotions. First, we establish a baseline by
training a transformer-based model for standard emotion classification,
achieving state-of-the-art performance. We argue that not all
misclassifications are of the same importance, as there are perceptual
similarities among emotional classes. We thus redefine the emotion labeling
problem by shifting it from a traditional classification model to an ordinal
classification one, where discrete emotions are arranged in a sequential order
according to their valence levels. Finally, we propose a method that performs
ordinal classification in the two-dimensional emotion space, considering both
valence and arousal scales. The results show that our approach not only
preserves high accuracy in emotion prediction but also significantly reduces
the magnitude of errors in cases of misclassification.
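The shift from standard to ordinal classification described in this abstract can be illustrated with a minimal sketch. It uses the common cumulative-threshold encoding (K-1 binary "rank > k" targets), which is an assumption here and not necessarily the authors' formulation. The label set, its valence ordering, and all function names are hypothetical.

```python
# Hypothetical valence ordering, from most negative to most positive.
# The abstract does not list the labels or their order; these are assumptions.
VALENCE_ORDER = ["sadness", "fear", "anger", "surprise", "joy"]
RANK = {label: i for i, label in enumerate(VALENCE_ORDER)}


def ordinal_targets(label):
    """Encode a label as K-1 cumulative binary targets: 'is rank > k?'."""
    r = RANK[label]
    return [1 if r > k else 0 for k in range(len(VALENCE_ORDER) - 1)]


def decode(probabilities, threshold=0.5):
    """Map K-1 'rank > k' probabilities back to a label by counting passes."""
    rank = sum(1 for p in probabilities if p > threshold)
    return VALENCE_ORDER[rank]


def error_magnitude(true_label, predicted_label):
    """Distance between two labels along the assumed valence scale."""
    return abs(RANK[true_label] - RANK[predicted_label])


if __name__ == "__main__":
    print(ordinal_targets("anger"))
    print(decode([0.9, 0.8, 0.3, 0.1]))
    print(error_magnitude("sadness", "joy"), error_magnitude("joy", "surprise"))
```

Under this encoding, confusing two neighbouring emotions costs less than confusing two distant ones, which is one way to quantify the "magnitude of errors" the abstract refers to. Extending the idea to two dimensions would mean keeping a second, independent ordering for arousal.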
From the Argonauts Mythological Sailors to the Argonautes RNA-Silencing Navigators: Their Emerging Roles in Human-Cell Pathologies
Regulation of gene expression has emerged as a fundamental element of transcript homeostasis. Key effectors in this process are the Argonautes (AGOs), highly specialized RNA-binding proteins (RBPs) that form complexes, such as the RNA-Induced Silencing Complex (RISC). AGOs dictate post-transcriptional gene silencing by directly loading small RNAs and repressing their mRNA targets through small-RNA sequence complementarity. The four highly conserved human family members (AGO1, AGO2, AGO3, and AGO4) play multi-faceted and versatile roles in the stability, plasticity, and functionality of the transcriptome. Post-translational modifications of AGOs at critical amino acid residues, nucleotide polymorphisms and mutations, and the deregulation of AGO expression and interactions are tightly associated with aberrant activities, which are observed in a wide spectrum of pathologies. As information has accumulated, the fundamental involvement of AGOs in multiple human diseases has recently emerged. The present review examines new insights into AGO-driven pathology and AGO-deregulation patterns in a variety of diseases, including viral infection and propagation, autoimmune diseases, cancers, metabolic deficiencies, neuronal disorders, and human infertility. Altogether, AGO seems to be a crucial contributor to pathogenesis, and its targeting may serve as a novel and powerful therapeutic tool for the successful management of diverse human diseases in the clinic.
Identifying and profiling structural similarities between Spike of SARS-CoV-2 and other viral or host proteins with Machaon
Using protein structure to predict function, interactions, and evolutionary history is still an open challenge, with existing approaches relying extensively on protein homology and families. Here, we present Machaon, a data-driven method combining orientation invariant metrics on phi-psi angles, inter-residue contacts and surface complexity. It can be readily applied on whole structures or segments—such as domains and binding sites. Machaon was applied on SARS-CoV-2 Spike monomers of native, Delta and Omicron variants and identified correlations with a wide range of viral proteins from close to distant taxonomy ranks, as well as host proteins, such as ACE2 receptor. Machaon’s meta-analysis of the results highlights structural, chemical and transcriptional similarities between the Spike monomer and human proteins, indicating a multi-level viral mimicry. This extended analysis also revealed relationships of the Spike protein with biological processes such as ubiquitination and angiogenesis and highlighted different patterns in virus attachment among the studied variants. Available at: https://machaonweb.com
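To illustrate what "orientation invariant" means for a contact-based metric, the sketch below builds a binary inter-residue contact map from 3D coordinates and checks that it is unchanged when the structure is rotated. This is a generic illustration, not Machaon's implementation: the 8.0 Å cutoff, the toy coordinates, and the function names are all assumptions.

```python
import math


def contact_map(coords, cutoff=8.0):
    """Binary contact map: 1 if two distinct residues lie within `cutoff`.

    `coords` is a list of (x, y, z) tuples, e.g. one point per residue.
    The 8.0 cutoff is an illustrative assumption, not a value from the paper.
    """
    n = len(coords)
    return [
        [1 if i != j and math.dist(coords[i], coords[j]) <= cutoff else 0
         for j in range(n)]
        for i in range(n)
    ]


def rotate_z(coords, angle):
    """Rotate all points about the z-axis by `angle` radians."""
    c, s = math.cos(angle), math.sin(angle)
    return [(c * x - s * y, s * x + c * y, z) for x, y, z in coords]


if __name__ == "__main__":
    # Toy coordinates (hypothetical), chosen so no distance sits near the cutoff.
    residues = [(0.0, 0.0, 0.0), (3.8, 0.0, 0.0), (7.6, 0.0, 0.0), (20.0, 0.0, 0.0)]
    original = contact_map(residues)
    rotated = contact_map(rotate_z(residues, 1.0))
    print(original == rotated)
```

Because the map depends only on pairwise distances, which rigid rotations preserve, two structures can be compared without first superimposing them.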
Dicing the Disease with Dicer: The Implications of Dicer Ribonuclease in Human Pathologies
Gene expression dictates fundamental cellular processes and its de-regulation leads to pathological conditions. A key contributor to the fine-tuning of gene expression is Dicer, an RNA-binding protein (RBP) that forms complexes and affects transcription by acting at the post-transcriptional level via the targeting of mRNAs by Dicer-produced small non-coding RNAs. This review aims to present the contribution of Dicer protein in a wide spectrum of human pathological conditions, including cancer, neurological, autoimmune, reproductive and cardiovascular diseases, as well as viral infections. Germline mutations of Dicer have been linked to Dicer1 syndrome, a rare genetic disorder that predisposes to the development of both benign and malignant tumors, but the exact correlation of Dicer protein expression within the different cancer types is unclear, and there are contradictions in the data. Downregulation of Dicer is related to geographic atrophy (GA), a severe eye disease that is a leading cause of blindness in industrialized countries, as well as to psychiatric and neurological diseases such as depression and Parkinson’s disease, respectively. Both loss and upregulation of Dicer protein expression are implicated in severe autoimmune disorders, including psoriasis, ankylosing spondylitis, rheumatoid arthritis, multiple sclerosis and autoimmune thyroid diseases. Loss of Dicer contributes to cardiovascular diseases and causes defective germ cell differentiation and reproductive system abnormalities in both sexes. Dicer can also act as a strong antiviral with a crucial role in RNA-based antiviral immunity. In conclusion, Dicer is an essential enzyme for the maintenance of physiology due to its pivotal role in several cellular processes, and its loss or aberrant expression contributes to the development of severe human diseases. Further research is required for the development of novel, more effective Dicer-based diagnostic and therapeutic strategies, with the goal of new clinical benefits and better quality of life for patients.
TarBase-v9.0 extends experimentally supported miRNA-gene interactions to cell-types and virally encoded miRNAs
TarBase is a reference database dedicated to producing, curating and delivering high-quality experimentally supported microRNA (miRNA) targets on protein-coding transcripts. In its latest version (v9.0, https://dianalab.e-ce.uth.gr/tarbasev9), it pushes the envelope by introducing virally encoded miRNAs, interactions leading to target-directed miRNA degradation (TDMD) events and the largest collection of miRNA-gene interactions to date in a plethora of experimental settings, tissues and cell types. It catalogues approximately 6 million entries, comprising approximately 2 million unique miRNA-gene pairs, supported by 37 experimental (high- and low-yield) protocols in 172 tissues and cell types. Interactions are annotated with rich metadata including information on genes/transcripts, miRNAs, samples, experimental contexts and publications, while millions of miRNA-binding locations are also provided at cell-type resolution. A completely redesigned interface, built with state-of-the-art web technologies, incorporates more features and allows flexible use. The new interface provides the capability to design sophisticated queries with numerous filtering criteria including cell lines, experimental conditions, cell types, experimental methods, species and/or tissues of interest. Additionally, a plethora of fine-tuning capacities have been integrated into the platform, allowing the returned interactions to be refined by miRNA confidence and expression levels, while unrestricted local retrieval of the offered interactions and metadata is enabled.