10 research outputs found

    Karaoker: Alignment-free singing voice synthesis with speech training data

    Existing singing voice synthesis (SVS) models are usually trained on singing data and depend on either error-prone time-alignment and duration features or explicit music score information. In this paper, we propose Karaoker, a multispeaker Tacotron-based model conditioned on voice characteristic features that is trained exclusively on spoken data without requiring time-alignments. Karaoker synthesizes singing voice following a multi-dimensional template extracted from a source waveform of an unseen speaker/singer. The model is jointly conditioned with a single deep convolutional encoder on continuous data including pitch, intensity, harmonicity, formants, cepstral peak prominence and octaves. We extend the text-to-speech training objective with feature reconstruction, classification and speaker identification tasks that guide the model to an accurate result. Besides multi-tasking, we also employ a Wasserstein GAN training scheme as well as new losses on the acoustic model's output to further refine the quality of the model. Comment: Submitted to INTERSPEECH 202

    Self-supervised learning for robust voice cloning

    Voice cloning is a difficult task which requires robust and informative features incorporated in a high quality TTS system in order to effectively copy an unseen speaker's voice. In our work, we utilize features learned in a self-supervised framework via the Bootstrap Your Own Latent (BYOL) method, which is shown to produce high quality speech representations when specific audio augmentations are applied to the vanilla algorithm. We further extend the augmentations in the training procedure to help the resulting features capture the speaker identity and to make them robust to noise and acoustic conditions. The learned features are used as pre-trained utterance-level embeddings and as inputs to a Non-Attentive Tacotron based architecture, aiming to achieve multispeaker speech synthesis without utilizing additional speaker features. This method enables us to train our model on an unlabeled multispeaker dataset as well as use unseen speaker embeddings to copy a speaker's voice. Subjective and objective evaluations are used to validate the proposed model, as well as the robustness to the acoustic conditions of the target utterance. Comment: Accepted to INTERSPEECH 202

    Improved Text Emotion Prediction Using Combined Valence and Arousal Ordinal Classification

    Emotion detection in textual data has received growing interest in recent years, as it is pivotal for developing empathetic human-computer interaction systems. This paper introduces a method for categorizing emotions from text, which acknowledges and differentiates between the diversified similarities and distinctions of various emotions. Initially, we establish a baseline by training a transformer-based model for standard emotion classification, achieving state-of-the-art performance. We argue that not all misclassifications are of the same importance, as there are perceptual similarities among emotional classes. We thus redefine the emotion labeling problem by shifting it from a traditional classification model to an ordinal classification one, where discrete emotions are arranged in a sequential order according to their valence levels. Finally, we propose a method that performs ordinal classification in the two-dimensional emotion space, considering both valence and arousal scales. The results show that our approach not only preserves high accuracy in emotion prediction but also significantly reduces the magnitude of errors in cases of misclassification.

    From the Argonauts Mythological Sailors to the Argonautes RNA-Silencing Navigators: Their Emerging Roles in Human-Cell Pathologies

    Regulation of gene expression has emerged as a fundamental element of transcript homeostasis. Key effectors in this process are the Argonautes (AGOs), highly specialized RNA-binding proteins (RBPs) that form complexes, such as the RNA-Induced Silencing Complex (RISC). AGOs dictate post-transcriptional gene-silencing by directly loading small RNAs and repressing their mRNA targets through small RNA-sequence complementarity. The four highly conserved human family members (AGO1, AGO2, AGO3, and AGO4) demonstrate multi-faceted and versatile roles in the transcriptome’s stability, plasticity, and functionality. The post-translational modifications of AGOs in critical amino acid residues, the nucleotide polymorphisms and mutations, and the deregulation of expression and interactions are tightly associated with aberrant activities, which are observed in a wide spectrum of pathologies. Through constantly accumulating information, the AGOs’ fundamental engagement in multiple human diseases has recently emerged. The present review examines new insights into AGO-driven pathology and AGO-deregulation patterns in a variety of diseases such as viral infections and propagations, autoimmune diseases, cancers, metabolic deficiencies, neuronal disorders, and human infertility. Altogether, AGO seems to be a crucial contributor to pathogenesis and its targeting may serve as a novel and powerful therapeutic tool for the successful management of diverse human diseases in the clinic.

    Identifying and profiling structural similarities between Spike of SARS-CoV-2 and other viral or host proteins with Machaon

    Using protein structure to predict function, interactions, and evolutionary history is still an open challenge, with existing approaches relying extensively on protein homology and families. Here, we present Machaon, a data-driven method combining orientation invariant metrics on phi-psi angles, inter-residue contacts and surface complexity. It can be readily applied to whole structures or segments, such as domains and binding sites. Machaon was applied to SARS-CoV-2 Spike monomers of native, Delta and Omicron variants and identified correlations with a wide range of viral proteins from close to distant taxonomy ranks, as well as host proteins, such as the ACE2 receptor. Machaon’s meta-analysis of the results highlights structural, chemical and transcriptional similarities between the Spike monomer and human proteins, indicating a multi-level viral mimicry. This extended analysis also revealed relationships of the Spike protein with biological processes such as ubiquitination and angiogenesis and highlighted different patterns in virus attachment among the studied variants. Available at: https://machaonweb.com

    Dicing the Disease with Dicer: The Implications of Dicer Ribonuclease in Human Pathologies

    Gene expression dictates fundamental cellular processes and its de-regulation leads to pathological conditions. A key contributor to the fine-tuning of gene expression is Dicer, an RNA-binding protein (RBP) that forms complexes and affects transcription by acting at the post-transcriptional level via the targeting of mRNAs by Dicer-produced small non-coding RNAs. This review aims to present the contribution of Dicer protein in a wide spectrum of human pathological conditions, including cancer, neurological, autoimmune, reproductive and cardiovascular diseases, as well as viral infections. Germline mutations of Dicer have been linked to Dicer1 syndrome, a rare genetic disorder that predisposes to the development of both benign and malignant tumors, but the exact correlation of Dicer protein expression within the different cancer types is unclear, and there are contradictions in the data. Downregulation of Dicer is related to geographic atrophy (GA), a severe eye disease that is a leading cause of blindness in industrialized countries, as well as to psychiatric and neurological diseases such as depression and Parkinson’s disease, respectively. Both loss and upregulation of Dicer protein expression are implicated in severe autoimmune disorders, including psoriasis, ankylosing spondylitis, rheumatoid arthritis, multiple sclerosis and autoimmune thyroid diseases. Loss of Dicer contributes to cardiovascular diseases and causes defective germ cell differentiation and reproductive system abnormalities in both sexes. Dicer can also act as a strong antiviral with a crucial role in RNA-based antiviral immunity. In conclusion, Dicer is an essential enzyme for the maintenance of physiology due to its pivotal role in several cellular processes, and its loss or aberrant expression contributes to the development of severe human diseases. Further exploitation is required for the development of novel, more effective Dicer-based diagnostic and therapeutic strategies, with the goal of new clinical benefits and better quality of life for patients.

    TarBase-v9.0 extends experimentally supported miRNA-gene interactions to cell-types and virally encoded miRNAs

    No full text
    TarBase is a reference database dedicated to producing, curating and delivering high quality experimentally-supported microRNA (miRNA) targets on protein-coding transcripts. In its latest version (v9.0, https://dianalab.e-ce.uth.gr/tarbasev9), it pushes the envelope by introducing virally-encoded miRNAs, interactions leading to target-directed miRNA degradation (TDMD) events and the largest collection of miRNA-gene interactions to date in a plethora of experimental settings, tissues and cell-types. It catalogues ~6 million entries, comprising ~2 million unique miRNA-gene pairs, supported by 37 experimental (high- and low-yield) protocols in 172 tissues and cell-types. Interactions are annotated with rich metadata including information on genes/transcripts, miRNAs, samples, experimental contexts and publications, while millions of miRNA-binding locations are also provided at cell-type resolution. A completely re-designed interface with state-of-the-art web technologies incorporates more features and allows flexible and ingenious use. The new interface provides the capability to design sophisticated queries with numerous filtering criteria including cell lines, experimental conditions, cell types, experimental methods, species and/or tissues of interest. Additionally, a plethora of fine-tuning capacities have been integrated into the platform, offering the refinement of the returned interactions based on miRNA confidence and expression levels, while boundless local retrieval of the offered interactions and metadata is enabled.