32 research outputs found

    Content-based Controls For Music Large Language Modeling

    Recent years have witnessed a rapid growth of large-scale language models in the domain of music audio. Such models enable end-to-end generation of higher-quality music, and some allow conditioned generation using text descriptions. However, the control power of text controls on music is intrinsically limited, as they can only describe music indirectly through meta-data (such as singers and instruments) or high-level representations (such as genre and emotion). We aim to further equip the models with direct and content-based controls on innate music languages such as pitch, chords and drum track. To this end, we contribute Coco-Mulla, a content-based control method for music large language modeling. It uses a parameter-efficient fine-tuning (PEFT) method tailored for Transformer-based audio models. Experiments show that our approach achieves high-quality music generation with low-resource semi-supervised learning, tuning fewer than 4% of the parameters of the original model and training on a small dataset of fewer than 300 songs. Moreover, our approach enables effective content-based controls, and we illustrate the control power via chords and rhythms, two of the most salient features of music audio. Furthermore, we show that by combining content-based controls and text descriptions, our system achieves flexible music variation generation and style transfer. Our source code and demos are available online.

    Loop Copilot: Conducting AI Ensembles for Music Generation and Iterative Editing

    Creating music is iterative, requiring varied methods at each stage. However, existing AI music systems fall short in orchestrating multiple subsystems for diverse needs. To address this gap, we introduce Loop Copilot, a novel system that enables users to generate and iteratively refine music through an interactive, multi-round dialogue interface. The system uses a large language model to interpret user intentions and select appropriate AI models for task execution. Each backend model is specialized for a specific task, and their outputs are aggregated to meet the user's requirements. To ensure musical coherence, essential attributes are maintained in a centralized table. We evaluate the effectiveness of the proposed system through semi-structured interviews and questionnaires, highlighting not only its utility in facilitating music creation but also its potential for broader applications. Comment: Source code and demo video are available at https://sites.google.com/view/loop-copilot

    Learning Transferable Spatiotemporal Representations from Natural Script Knowledge

    Pre-training on large-scale video data has become a common recipe for learning transferable spatiotemporal representations in recent years. Despite some progress, existing methods are mostly limited to highly curated datasets (e.g., K400) and exhibit unsatisfactory out-of-the-box representations. We argue that this is because they capture only pixel-level knowledge rather than spatiotemporal commonsense, which is far from cognition-level video understanding. Inspired by the great success of image-text pre-training (e.g., CLIP), we take the first step to exploit language semantics to boost transferable spatiotemporal representation learning. We introduce a new pretext task, Turning to Video for Transcript Sorting (TVTS), which sorts shuffled ASR scripts by attending to learned video representations. We do not rely on descriptive captions and learn purely from video, i.e., leveraging the natural transcribed speech knowledge to provide noisy but useful semantics over time. Furthermore, rather than the simple concept learning in vision-caption contrast, we encourage cognition-level temporal commonsense reasoning via narrative reorganization. These advantages enable our model to contextualize what is happening as humans do and to apply seamlessly to large-scale uncurated video data in the real world. Note that our method differs from ones designed for video-text alignment (e.g., Frozen) and multimodal representation learning (e.g., Merlot). Our method demonstrates strong out-of-the-box spatiotemporal representations on diverse video benchmarks, e.g., +13.6% gains over VideoMAE on SSV2 via linear probing.

    The Protective Role of Hyaluronic Acid in Cr(VI)-Induced Oxidative Damage in Corneal Epithelial Cells

    Cr(VI) exposure can produce various intermediates and reactive oxygen species (ROS), both of which are related to DNA damage. Hyaluronan (HA) has impressive biological functions and has been reported to protect corneal epithelial cells against oxidative damage induced by ultraviolet B, benzalkonium chloride, and sodium lauryl sulfate. The aim of our study was therefore to investigate whether HA protects human corneal epithelial (HCE) cells against Cr(VI)-induced toxic effects. HCE cell lines were exposed to different concentrations of K2Cr2O7 (1.875, 3.75, 7.5, 15.0, and 30 μM), or to a combination of K2Cr2O7 and 0.2% HA, and incubated for different times (15 min, 30 min, and 60 min). Our data showed that Cr(VI) exposure decreased cell viability and increased DNA damage and ROS generation in the HCE cell lines. However, incubation with HA increased HCE cell survival rates and decreased the DNA damage and ROS generation induced by Cr(VI) in a dose- and time-dependent manner. We report for the first time that HA can protect HCE cells against the toxicity of Cr(VI), indicating that it may be a promising therapeutic agent for corneal injuries caused by Cr(VI).

    DeePMD-kit v2: A software package for Deep Potential models

    DeePMD-kit is a powerful open-source software package that facilitates molecular dynamics simulations using machine learning potentials (MLP) known as Deep Potential (DP) models. This package, which was released in 2017, has been widely used in the fields of physics, chemistry, biology, and material science for studying atomistic systems. The current version of DeePMD-kit offers numerous advanced features such as DeepPot-SE, attention-based and hybrid descriptors, the ability to fit tensile properties, type embedding, model deviation, Deep Potential - Range Correction (DPRc), Deep Potential Long Range (DPLR), GPU support for customized operators, model compression, non-von Neumann molecular dynamics (NVNMD), and improved usability, including documentation, compiled binary packages, graphical user interfaces (GUI), and application programming interfaces (API). This article presents an overview of the current major version of the DeePMD-kit package, highlighting its features and technical details. Additionally, the article benchmarks the accuracy and efficiency of different models and discusses ongoing developments. Comment: 51 pages, 2 figures

    Orai1 and Orai3 Mediate Store-Operated Calcium Entry Contributing to Neuronal Excitability in Dorsal Root Ganglion Neurons

    Store-operated calcium channels (SOCs) are highly calcium-selective channels that mediate calcium entry in various cell types. We have previously reported that intraplantar injection of YM-58483 (a SOC inhibitor) attenuates chronic pain. A previous study has reported that the function of SOCs in dorsal root ganglia (DRG) is enhanced after nerve injury, suggesting that SOCs may play a peripheral role in chronic pain. However, the expression, functional distribution and significance of the SOC family in DRG neurons remain elusive, and the key components that mediate store-operated calcium entry (SOCE) are still controversial. Here, we demonstrated that the SOC family (STIM1, STIM2, Orai1, Orai2, and Orai3) was expressed in DRGs and that STIM1 was mainly present in small- and medium-sized DRG neurons. Using confocal live cell imaging, Ca2+ imaging and electrophysiology techniques, we demonstrated that depletion of the endoplasmic reticulum Ca2+ stores induced STIM1 and STIM2 translocation, and that inhibition of STIM1 or blockage of Orai channels with pharmacological tools attenuated SOCE and SOC currents. Using the small inhibitory RNA knockdown approach, we identified STIM1, STIM2, Orai1, and Orai3 as the key components of SOCs mediating SOCE in DRG neurons. Importantly, activation of SOCs by thapsigargin induced plasma membrane depolarization and increased neuronal excitability, which were completely abolished by inhibition of SOCs or double knockdown of Orai1 and Orai3. Our findings suggest that SOCs exert an excitatory action in DRG neurons and provide a potential peripheral mechanism for modulation of pain hypersensitivity by SOC inhibition.

    PvCT: A Publicly Verifiable Contact Tracing Algorithm in Cloud Computing

    Contact tracing is a critical tool in containing epidemics such as COVID-19, and researchers have studied it extensively. However, almost all existing works assume that clients and authorities have large storage space and powerful computation capability, and that clients can implement contact tracing on their own mobile devices such as mobile phones, tablet computers, and wearable computers. As epidemics spread widely, these approaches become less robust to larger datasets when clients are resource-constrained. To address this limitation, we propose a publicly verifiable contact tracing algorithm in cloud computing (PvCT), which utilizes cloud services to provide storage and computation capability in contact tracing. To guarantee the integrity and accuracy of contact tracing results, PvCT applies a novel set accumulator-based authentication data structure whose computation is outsourced, and the client can check whether returned results are valid. Furthermore, we provide a rigorous security proof of our algorithm based on the q-Strong Bilinear Diffie–Hellman assumption. A detailed experimental evaluation is also conducted on three real-world datasets. The results show that our algorithm is feasible within milliseconds of client CPU time and can significantly reduce the storage overhead from the size of datasets to a constant 128 bytes.

    Alkali-metal ions (M+ = Li+, Na+, K+, Rb+, and Cs+) endohedral cyclo[18]carbon (C18): Exploring binding interactions and predicting optical properties

    With growing interest in carbon-based materials for energy storage and active research in the field of advanced optoelectronic devices, we theoretically designed ten complexes in which cyclo[18]carbon (C18) binds an alkali-metal ion (M+ = Li+, Na+, K+, Rb+, and Cs+) either inside or outside the ring, referred to as M+@C18in and M+@C18out respectively, and performed careful analyses of the binding interaction between M+ and C18 as well as the optical properties of the stable endohedral complexes M+@C18in. The effects of the atomic number of the alkali metal on the electronic structures, binding interactions, electronic absorption spectra, and molecular (hyper)polarizabilities of M+@C18in were studied using accurate density functional theory (DFT) calculations. The results indicated that the differences in radius and properties of M+ lead to different binding modes and strengths with C18, but there is no difference in electronic absorption spectra between the complexes; the polarizability and second hyperpolarizability of M+@C18in containing different alkali-metal ions are similar due to their analogous electronic structures, but their first hyperpolarizabilities differ greatly due to discrepancies in molecular symmetry. The similarities and differences in intramolecular interactions, electronic absorption spectra, and (hyper)polarizability of M+@C18in were explored using advanced wavefunction analysis methods.