
    A Fully Time-domain Neural Model for Subband-based Speech Synthesizer

    This paper introduces a deep neural network model for a subband-based speech synthesizer. The model exploits the narrow bandwidth of the subband signals to reduce the complexity of the time-domain speech generator. We employ multi-level wavelet analysis/synthesis to decompose the signal into subbands and reconstruct it in the time domain. Inspired by WaveNet, a convolutional neural network (CNN) model predicts the subband speech signals fully in the time domain. Because the subbands are narrow, a simple network architecture is enough to learn their simple patterns accurately. In ground-truth experiments with teacher forcing, the subband synthesizer significantly outperforms the fullband model on both subjective and objective measures. In addition, by conditioning the model on the phoneme sequence obtained from a pronunciation dictionary, we obtain a fully time-domain neural model for a subband-based text-to-speech (TTS) synthesizer, which is nearly end-to-end. The speech generated by the subband TTS is comparable in quality to that of the fullband one, with a lighter network architecture for each subband. Comment: 5 pages, 3 figures
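    The multi-level wavelet analysis/synthesis step mentioned above can be illustrated with a minimal sketch. The abstract does not say which wavelet or tree structure the authors used, so the Haar filter, the dyadic tree that re-splits only the low band, and the function names below are all assumptions chosen for brevity.

```python
import math

SCALE = 1 / math.sqrt(2)  # orthonormal Haar scaling factor


def haar_analysis(signal):
    """One analysis level: split a signal into low and high half-rate bands."""
    low = [(signal[i] + signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    high = [(signal[i] - signal[i + 1]) * SCALE for i in range(0, len(signal), 2)]
    return low, high


def haar_synthesis(low, high):
    """One synthesis level: merge a low/high pair back into one signal."""
    out = []
    for a, d in zip(low, high):
        out.append((a + d) * SCALE)
        out.append((a - d) * SCALE)
    return out


def decompose(signal, levels):
    """Multi-level analysis; returns [approx, detail_coarsest, ..., detail_finest]."""
    if len(signal) % (2 ** levels) != 0:
        raise ValueError("signal length must be divisible by 2**levels")
    approx = list(signal)
    details = []
    for _ in range(levels):
        approx, high = haar_analysis(approx)
        details.append(high)
    return [approx] + details[::-1]


def reconstruct(subbands):
    """Multi-level synthesis; inverts decompose() up to rounding error."""
    approx = subbands[0]
    for high in subbands[1:]:
        approx = haar_synthesis(approx, high)
    return approx
```

    With two levels, an 8-sample signal splits into subbands of 2, 2, and 4 samples, and `reconstruct(decompose(x, 2))` returns `x` up to floating-point rounding. Each subband is shorter and narrower in bandwidth than the input, which is the property the abstract relies on to keep each per-subband network small.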

    CSR Disclosure in Three Market Economies: A Longitudinal Content Analysis of the Manifestation of Ethics, the Coverage of Stakeholders, the Transparency of Information and the CSR Themes from an Institutional Perspective

    Drawing on institutional theory, this content analysis investigated CSR communication in 750 corporate reports spanning a 10-year period from 150 companies from liberal market economies (LMEs: the US and UK), coordinated market economies (CMEs: Germany and Japan), and state-led market economies (SLMEs: France and South Korea). While CSR communication did not become more explicit over time in terms of the form of communication, the total page counts indicated a significant increase from earlier to later periods, suggesting more explicit CSR communication. Also, significant increases in the scope and depth of stakeholders as well as the transparency of messages were found. The emphasis on the supplier significantly increased over time. The most prominent stakeholder and CSR theme, in relative terms, were the employee and the environment, respectively. The SLMEs – while exhibiting significantly more implicit CSR communication than the other market economies – showed market-driven CSR through a significantly higher emphasis on the shareholder than the LMEs, higher relative prominence of the shareholder and the CSR theme of economic responsibility than the other market economies, and a significantly decreasing emphasis on the employee. The LMEs deviated from their characterization as shareholder-based market economies. The LMEs showed significantly higher relative prominence of the stakeholder groups of the government and community, as well as the CSR theme of the community, than the other market economies. Additionally, the relative prominence of the investor was significantly lower in the LMEs than in the CMEs. The CMEs showed significantly lower attention to ethics than the other market economies, with a decreasing trend from the first to the last period. However, the relative prominence of the CSR theme of business ethics – which includes other areas such as human rights as well as ethics – was significantly higher in the CMEs than in the other market economies.
Additionally, the transparency of messages was significantly higher in the CMEs than in the other market economies. The titles of CSR communications differed significantly, with the corporate citizenship title used significantly more and the sustainability title less in the LMEs, while the CSR title was used less in the SLMEs. Other theoretical and practical implications are discussed. Doctor of Philosophy

    Love flows downstream: mothers’ and children’s neural representation similarity in perceiving distress of self and family

    The current study aimed to capture empathy processing in an interpersonal context. Mother–adolescent dyads (N = 22) each completed an empathy task during fMRI, in which they imagined the target person in distressing scenes as either themselves or their family (i.e. child for the mother, mother for the child). Using a multi-voxel pattern approach, we compared neural pattern similarity for the self and family conditions and found that mothers showed greater perceptual similarity between self and child in the fusiform face area (FFA), representing high self–child overlap, whereas adolescents showed significantly less self–mother overlap. Adolescents’ pattern similarity depended on family relationship quality, such that they showed greater self–mother overlap with higher relationship quality, whereas mothers’ pattern similarity was independent of relationship quality. Furthermore, adolescents’ perceptual similarity in the FFA was associated with increased social brain activation (e.g. temporal parietal junction). Mediation analyses indicated that high relationship quality was associated with greater social brain activation, which was mediated by greater self–mother overlap in the FFA. Our findings suggest that adolescents show more distinct neural patterns in perceiving their own vs their mother’s distress, and that this distinction is sensitive to mother–child relationship quality. In contrast, mothers’ perception of their own and their child’s distress is highly similar and unconditional.

    Adjusting Pleasure-Arousal-Dominance for Continuous Emotional Text-to-speech Synthesizer

    Emotion is not limited to discrete categories such as happy, sad, angry, fear, disgust, and surprise. Instead, each emotion category is projected onto a set of nearly independent dimensions, named pleasure (or valence), arousal, and dominance, known as PAD. The value of each dimension varies from -1 to 1, such that the neutral emotion is in the center with all-zero values. Training a continuous emotional text-to-speech (TTS) synthesizer on the independent dimensions makes it possible to synthesize emotional speech with unlimited emotion categories. Our end-to-end neural speech synthesizer is based on the well-known Tacotron. Empirically, we have found the optimum network architecture for injecting the 3D PADs. Moreover, the PAD values are adjusted for the speech synthesis purpose. Comment: Interspeech2019, Show and Tell demonstration https://www.youtube.com/watch?v=MAOk_ZxuA0I&feature=youtu.b
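    The PAD representation described above (three values in [-1, 1], with neutral at the origin) can be sketched as a small data structure. The class name, the `scale` helper, and the linear move away from neutral are hypothetical illustrations; the abstract does not describe how the authors adjust or inject PAD values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PAD:
    """A point in pleasure-arousal-dominance space; each axis lies in [-1, 1]."""
    pleasure: float
    arousal: float
    dominance: float

    def __post_init__(self):
        for name in ("pleasure", "arousal", "dominance"):
            value = getattr(self, name)
            if not -1.0 <= value <= 1.0:
                raise ValueError(f"{name}={value} is outside [-1, 1]")


# The neutral emotion sits at the center with all-zero values.
NEUTRAL = PAD(0.0, 0.0, 0.0)


def scale(point, intensity):
    """Move linearly from neutral (intensity 0) toward `point` (intensity 1).

    Assumption for illustration only: the source does not specify this rule.
    """
    if not 0.0 <= intensity <= 1.0:
        raise ValueError("intensity must lie in [0, 1]")
    return PAD(
        point.pleasure * intensity,
        point.arousal * intensity,
        point.dominance * intensity,
    )
```

    Because the dimensions are continuous, any point in the cube is a valid conditioning input, which is what allows synthesis beyond a fixed set of emotion labels.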

    Dynamical mean-field theory of Hubbard-Holstein model at half-filling: Zero temperature metal-insulator and insulator-insulator transitions

    We study the Hubbard-Holstein model, which includes both the electron-electron and electron-phonon interactions, characterized by U and g, respectively, employing dynamical mean-field theory combined with Wilson's numerical renormalization group technique. A zero-temperature phase diagram of metal-insulator and insulator-insulator transitions at half-filling is mapped out, which exhibits the interplay between U and g. As U (g) is increased, a metal to Mott-Hubbard insulator (bipolaron insulator) transition occurs, and the two insulating states are distinct and cannot be adiabatically connected. The nature of the three states and the transitions between them are discussed. Comment: 5 pages, 4 figures. Submitted to Physical Review Letters

    Triclinic Na3.12Co2.44(P2O7)2 as a High Redox Potential Cathode Material for Na-Ion Batteries

    Two types of sodium cobalt pyrophosphates, triclinic Na3.12Co2.44(P2O7)2 and orthorhombic Na2CoP2O7, are compared as high-voltage cathode materials for Na-ion batteries. Na2CoP2O7 shows no electrochemical activity, delivering negligible capacity. In contrast, Na3.12Co2.44(P2O7)2 exhibits good electrochemical performance, such as a high redox potential at ca. 4.3 V (vs. Na/Na+) and stable capacity retention over 50 cycles, although Na3.12Co2.44(P2O7)2 delivered approximately 40 mA h g⁻¹. This is attributed to the fact that Na2CoP2O7 (~3.1 Å) has a smaller diffusion channel size than Na3.12Co2.44(P2O7)2 (~4.2 Å). Moreover, the electrochemical performance of Na3.12Co2.44(P2O7)2 is examined using Na cells and Li cells. The overpotential of Na cells is smaller than that of Li cells. This is due to the fact that Na3.12Co2.44(P2O7)2 has a smaller charge transfer resistance and higher diffusivity for Na+ ions than for Li+ ions. This implies that the large channel size of Na3.12Co2.44(P2O7)2 is more appropriate for Na+ ions than for Li+ ions. Therefore, Na3.12Co2.44(P2O7)2 is considered a promising high-voltage cathode material for Na-ion batteries, if new electrolytes that are stable above 4.5 V vs. Na/Na+ are introduced.

    Families that fire together smile together: Resting state connectome similarity and daily emotional synchrony in parent-child dyads

    Despite emerging evidence suggesting a biological basis to our social ties, the neural processes that link two minds remain unknown. We implemented a novel approach that combined connectome similarity analysis, using resting-state intrinsic networks of parent-child dyads, with daily diaries collected across 14 days. Intrinsic resting-state networks (RSNs) for both parents and their adolescent child were identified using independent component analysis (ICA). Results indicate that parents and children who had more similar RSN connectomes also had greater day-to-day emotional synchrony. Furthermore, dyadic RSN connectome similarity was associated with children's emotional competence, suggesting that being neurally in tune with their parents confers emotional benefits. We provide the first evidence that dyadic RSN similarity is associated with emotional synchrony in what is often our first and most essential social bond, the parent-child relationship.