112 research outputs found

    Deep Cross-Modal Audio-Visual Generation

    Cross-modal audio-visual perception has been a long-lasting topic in psychology and neurology, and various studies have discovered strong correlations in human perception of auditory and visual stimuli. Despite works in computational multimodal modeling, the problem of cross-modal audio-visual generation has not been systematically studied in the literature. In this paper, we make the first attempt to solve this cross-modal generation problem leveraging the power of deep generative adversarial training. Specifically, we use conditional generative adversarial networks to achieve cross-modal audio-visual generation of musical performances. We explore different encoding methods for audio and visual signals, and work on two scenarios: instrument-oriented generation and pose-oriented generation. Being the first to explore this new problem, we compose two new datasets with pairs of images and sounds of musical performances of different instruments. Our experiments using both classification and human evaluations demonstrate that our model has the ability to generate one modality, i.e., audio/visual, from the other modality, i.e., visual/audio, to a good extent. Our experiments on various design choices along with the datasets will facilitate future research in this new problem space.

    Hierarchical Cross-Modal Talking Face Generation with Dynamic Pixel-Wise Loss

    We devise a cascade GAN approach to generate talking face video, which is robust to different face shapes, view angles, facial characteristics, and noisy audio conditions. Instead of learning a direct mapping from audio to video frames, we propose first to transfer audio to high-level structure, i.e., the facial landmarks, and then to generate video frames conditioned on the landmarks. Compared to a direct audio-to-image approach, our cascade approach avoids fitting spurious correlations between audiovisual signals that are irrelevant to the speech content. We, humans, are sensitive to temporal discontinuities and subtle artifacts in video. To avoid those pixel jittering problems and to enforce the network to focus on audiovisual-correlated regions, we propose a novel dynamically adjustable pixel-wise loss with an attention mechanism. Furthermore, to generate a sharper image with well-synchronized facial movements, we propose a novel regression-based discriminator structure, which considers sequence-level information along with frame-level information. Thoughtful experiments on several datasets and real-world samples demonstrate significantly better results obtained by our method than the state-of-the-art methods in both quantitative and qualitative comparisons.

    Proteasome activator 28A: A clinical biomarker and pharmaceutical target in acute cerebral infarction therapy

    Purpose: To determine the dynamic changes in serum levels of PA28α in patients with acute cerebral infarction (ACI), and to investigate its correlation with infarct size and neurological deficit of the disease. Methods: A total of 100 ACI patients and 100 healthy volunteers were recruited from The First Affiliated Hospital of Xinxiang Medical University as case and control groups, respectively. Their serum levels of PA28α were determined by quantitative reverse transcription-polymerase chain reaction (qRT-PCR). The potential of PA28α in predicting the incidence of ACI was assessed by plotting ROC curves. Multivariate logistic regression analysis was conducted to investigate the risk factors of ACI. In addition, an ACI model in rats was established, and ACI rats were classified into 1, 3, 5, 7 and 14 day subgroups based on the duration post-ACI. Rats in the sham group served as control. Results: Serum level of PA28α was significantly higher in ACI patients than in controls. Moreover, the serum level of PA28α at admission was positively correlated to the NIHSS score and infarct volume of ACI patients. The level of PA28α in ACI rats gradually increased post-ACI, reaching a peak on day 7. The number of apoptotic brain cells in ACI rats gradually decreased after ACI. In addition, PA28α level was negatively correlated to the number of apoptotic brain cells in ACI rats (R² = 0.5148, p < 0.001). Conclusion: The serum level of PA28α is elevated in ACI patients, and is positively correlated to infarct volume and neurological deficit of the disease. The dynamic change in brain cell apoptosis post-ACI is negatively correlated to the serum level of PA28α. These findings may provide a theoretical basis for the diagnosis and treatment of ACI.

    MasterRTL: A Pre-Synthesis PPA Estimation Framework for Any RTL Design

    In modern VLSI design flow, the register-transfer level (RTL) stage is a critical point, where designers define precise design behavior with hardware description languages (HDLs) like Verilog. Since the RTL design is in the format of HDL code, the standard way to evaluate its quality requires time-consuming subsequent synthesis steps with EDA tools. This time-consuming process significantly impedes design optimization at the early RTL stage. Despite the emergence of some recent ML-based solutions, they fail to maintain high accuracy for any given RTL design. In this work, we propose an innovative pre-synthesis PPA estimation framework named MasterRTL. It first converts the HDL code to a new bit-level design representation named the simple operator graph (SOG). By only adopting single-bit simple operators, this SOG proves to be a general representation that unifies different design types and styles. The SOG is also more similar to the target gate-level netlist, reducing the gap between RTL representation and netlist. In addition to the new SOG representation, MasterRTL proposes new ML methods for the RTL-stage modeling of timing, power, and area separately. Compared with state-of-the-art solutions, the experiment on a comprehensive dataset with 90 different designs shows accuracy improvement by 0.33, 0.22, and 0.15 in correlation for total negative slack (TNS), worst negative slack (WNS), and power, respectively. Comment: To be published in the Proceedings of the 42nd IEEE/ACM International Conference on Computer-Aided Design (ICCAD), 202

    Effects of Chinese Medicine Tong xinluo on Diabetic Nephropathy via Inhibiting TGF-β

    Diabetic nephropathy (DN) is a major cause of chronic kidney failure and is characterized by interstitial and glomerular fibrosis. Epithelial-to-mesenchymal transition (EMT) plays an important role in the pathogenesis of DN. Tong xinluo (TXL), a Chinese herbal compound, has been used in China with established therapeutic efficacy in patients with DN. To investigate the molecular mechanism by which TXL improves DN, KK-Ay mice were selected as models for the evaluation of pathogenesis and treatment in DN. In vitro, TGF-β1 was used to induce EMT. Western blot (WB), immunofluorescence staining, and real-time polymerase chain reaction (RT-PCR) were applied to detect the changes of EMT markers in vivo and in vitro. Results showed that the expressions of TGF-β1 and its downstream proteins smad3/p-smad3 were greatly reduced in the TXL group; meanwhile, TXL restored the expression of smad7. As a result, the expressions of collagen IV (Col IV) and fibronectin (FN) were significantly decreased in the TXL group. In vivo, 24 h-UAER (24-hour urine albumin excretion ratio) and BUN (blood urea nitrogen) were decreased and Ccr (creatinine clearance ratio) was increased in the TXL group compared with the DN group. In summary, the present study demonstrates that TXL successfully inhibits TGF-β1-induced epithelial-to-mesenchymal transition in DN, which may account for the therapeutic efficacy in TXL-mediated renoprotection.