37 research outputs found

    GETMusic: Generating Any Music Tracks with a Unified Representation and Diffusion Framework

    Symbolic music generation aims to create musical notes, which can help users compose music, such as generating target instrument tracks based on provided source tracks. In practical scenarios where there's a predefined ensemble of tracks and various composition needs, an efficient and effective generative model that can generate any target tracks based on the other tracks becomes crucial. However, previous efforts have fallen short in addressing this necessity due to limitations in their music representations and models. In this paper, we introduce a framework known as GETMusic, with "GET" standing for "GEnerate music Tracks." This framework encompasses a novel music representation "GETScore" and a diffusion model "GETDiff." GETScore represents musical notes as tokens and organizes tokens in a 2D structure, with tracks stacked vertically and progressing horizontally over time. At a training step, each track of a music piece is randomly selected as either the target or source. The training involves two processes: In the forward process, target tracks are corrupted by masking their tokens, while source tracks remain as the ground truth; in the denoising process, GETDiff is trained to predict the masked target tokens conditioned on the source tracks. Our proposed representation, coupled with the non-autoregressive generative model, empowers GETMusic to generate music with any arbitrary source-target track combinations. Our experiments demonstrate that the versatile GETMusic outperforms prior works proposed for certain specific composition tasks. Comment: 13 pages, 4 figures
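The forward (corruption) process described in this abstract can be illustrated with a toy sketch. This is not the authors' implementation: the token strings, the `MASK` symbol, the `corrupt_targets` helper, and the 50% target/source split are all hypothetical choices made for illustration, and the real GETScore tokenization is considerably richer.

```python
import random

MASK = "[MASK]"  # hypothetical mask token; the paper's actual symbol may differ


def corrupt_targets(score, target_tracks, mask_prob, rng):
    """Mask tokens in target tracks only; source tracks are copied unchanged."""
    corrupted = []
    for track_idx, track in enumerate(score):
        if track_idx in target_tracks:
            # Each target token is replaced by MASK with probability mask_prob.
            corrupted.append(
                [MASK if rng.random() < mask_prob else tok for tok in track]
            )
        else:
            # Source tracks remain as the ground truth.
            corrupted.append(list(track))
    return corrupted


rng = random.Random(0)

# Toy 2D layout: one row per track, one column per time step.
score = [
    ["C4", "E4", "G4", "C5"],  # track 0 (e.g. melody)
    ["C2", "C2", "G2", "G2"],  # track 1 (e.g. bass)
    ["K", "S", "K", "S"],      # track 2 (e.g. drums)
]

# Each track is randomly selected as either target or source.
target_tracks = {i for i in range(len(score)) if rng.random() < 0.5}

# mask_prob=1.0 corresponds to fully corrupted target tracks.
corrupted = corrupt_targets(score, target_tracks, mask_prob=1.0, rng=rng)

for row in corrupted:
    print(row)
```

A denoising model would then be trained to recover the masked entries given the untouched source rows; that part is omitted here because the abstract does not describe the network.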

    EmoGen: Eliminating Subjective Bias in Emotional Music Generation

    Music is used to convey emotions, and thus generating emotional music is important in automatic music generation. Previous work on emotional music generation directly uses annotated emotion labels as control signals, which suffers from subjective bias: different people may annotate different emotions on the same music, and one person may feel different emotions under different situations. Therefore, directly mapping emotion labels to music sequences in an end-to-end way would confuse the learning process and hinder the model from generating music with general emotions. In this paper, we propose EmoGen, an emotional music generation system that leverages a set of emotion-related music attributes as the bridge between emotion and music, and divides the generation into two stages: emotion-to-attribute mapping with supervised clustering, and attribute-to-music generation with self-supervised learning. Both stages are beneficial: in the first stage, the attribute values around the clustering center represent the general emotions of these samples, which helps eliminate the impact of the subjective bias of emotion labels; in the second stage, the generation is completely disentangled from emotion labels and thus free from the subjective bias. Both subjective and objective evaluations show that EmoGen outperforms previous methods on emotion control accuracy and music quality respectively, demonstrating our superiority in generating emotional music. Music samples generated by EmoGen are available via this link: https://ai-muzic.github.io/emogen/, and the code is available at this link: https://github.com/microsoft/muzic/. Comment: 12 pages, 7 page

    MuseCoco: Generating Symbolic Music from Text

    Generating music from text descriptions is a user-friendly mode since the text is a relatively easy interface for user engagement. While some approaches utilize texts to control music audio generation, editing musical elements in generated audio is challenging for users. In contrast, symbolic music offers ease of editing, making it more accessible for users to manipulate specific musical elements. In this paper, we propose MuseCoco, which generates symbolic music from text descriptions with musical attributes as the bridge to break down the task into text-to-attribute understanding and attribute-to-music generation stages. MuseCoco stands for Music Composition Copilot that empowers musicians to generate music directly from given text descriptions, offering a significant improvement in efficiency compared to creating music entirely from scratch. The system has two main advantages: Firstly, it is data efficient. In the attribute-to-music generation stage, the attributes can be directly extracted from music sequences, making the model training self-supervised. In the text-to-attribute understanding stage, the text is synthesized and refined by ChatGPT based on the defined attribute templates. Secondly, the system can achieve precise control with specific attributes in text descriptions and offers multiple control options through attribute-conditioned or text-conditioned approaches. MuseCoco outperforms baseline systems in terms of musicality, controllability, and overall score by at least 1.27, 1.08, and 1.32 respectively. Besides, there is a notable enhancement of about 20% in objective control accuracy. In addition, we have developed a robust large-scale model with 1.2 billion parameters, showcasing exceptional controllability and musicality.

    First come, First served: Enhancing the Convenience Store Service Experience

    One distinctive characteristic of Taiwanese city streets is the omnipresence of convenience stores. These clean, brightly lit stores are in operation 24 hours a day, seven days a week, and offer a wide range of constantly updated lifestyle products and services. Past research on convenience stores has often overlooked the work experiences of convenience store employees, and their contribution to the overall service experience. Thus, the goal of this exploratory study is to explore the convenience store work environment, and to provide some suggestions for in-store technological enhancements. Data was collected through in-depth interviewing, field study observations and Living Lab methodologies. Our research reveals that convenience store employees experience several types of physical, mental and emotional strains throughout their shifts. These strains are often derived from excessive physical exertion and unpleasant interactions with customers. We suggest that certain in-store technological enhancements, such as seamless sensing and seamful actuating, can serve to alleviate employees' sense of pressure and anxiety during customer interactions.

    MusicAgent: An AI Agent for Music Understanding and Generation with Large Language Models

    AI-empowered music processing is a diverse field that encompasses dozens of tasks, ranging from generation tasks (e.g., timbre synthesis) to comprehension tasks (e.g., music classification). For developers and amateurs, it is very difficult to grasp all of these tasks to satisfy their requirements in music processing, especially considering the huge differences in the representations of music data and the model applicability across platforms among various tasks. Consequently, it is necessary to build a system to organize and integrate these tasks, and thus help practitioners to automatically analyze their demand and call suitable tools as solutions to fulfill their requirements. Inspired by the recent success of large language models (LLMs) in task automation, we develop a system, named MusicAgent, which integrates numerous music-related tools and an autonomous workflow to address user requirements. More specifically, we build 1) a toolset that collects tools from diverse sources, including Hugging Face, GitHub, and Web APIs; and 2) an autonomous workflow empowered by LLMs (e.g., ChatGPT) to organize these tools, automatically decompose user requests into multiple sub-tasks, and invoke the corresponding music tools. The primary goal of this system is to free users from the intricacies of AI-music tools, enabling them to concentrate on the creative aspect. By granting users the freedom to effortlessly combine tools, the system offers a seamless and enriching music experience.

    Phase evolution and superconductivity enhancement in Se-substituted MoTe$_2$ thin films

    The strong spin-orbit coupling (SOC) and numerous crystal phases in few-layer transition metal dichalcogenides (TMDCs) MX$_2$ (M = W, Mo, and X = Te, Se, S) have led to a variety of novel physics, such as Ising superconductivity and the quantum spin Hall effect realized in monolayer 2H- and Td-MX$_2$, respectively. Consecutive tailoring of the MX$_2$ structure from the 2H to the Td phase may realize the long-sought topological superconductivity in one material system by incorporating superconductivity and the quantum spin Hall effect together. In this work, by combining Raman spectroscopy, X-ray photoelectron spectroscopy (XPS), scanning transmission electron microscopy (STEM) imaging as well as electrical transport measurements, we demonstrate that consecutive structural phase transitions from the Td to the 1T$'$ to the 2H polytype can be realized as the Se-substitution concentration increases. More importantly, the Se-substitution has been found to notably enhance the superconductivity of the MoTe$_2$ thin film, which is interpreted as the introduction of two-band superconductivity. The phase transition induced by the chemical constituent offers a new strategy to study the $s_{+-}$ superconductivity and the possible topological superconductivity as well as to develop phase-sensitive devices based on MX$_2$ materials. Comment: 27 pages, 5 figures

    Transport evidence of asymmetric spin-orbit coupling in few-layer superconducting 1Td-MoTe$_2$

    Two-dimensional (2D) transition metal dichalcogenides (TMDCs) MX$_2$ (M = W, Mo, Nb, and X = Te, Se, S) with strong spin-orbit coupling (SOC) possess plenty of novel physics including superconductivity. Due to the Ising SOC, monolayer NbSe$_2$ and gated MoS$_2$ of 2H structure can realize the Ising superconductivity phase, which manifests itself with an in-plane upper critical field far exceeding the Pauli paramagnetic limit. Surprisingly, we find that a few-layer 1Td structure MoTe$_2$ also exhibits an in-plane upper critical field ($H_{c2,//}$) which goes beyond the Pauli paramagnetic limit. Importantly, the in-plane upper critical field shows an emergent two-fold symmetry which is different from the isotropic $H_{c2,//}$ in 2H structure TMDCs. We show that this is a result of an asymmetric SOC in 1Td structure TMDCs. The asymmetric SOC is very strong and estimated to be on the order of tens of meV. Our work provides the first transport evidence of a new type of asymmetric SOC in TMDCs which may give rise to novel superconducting and spin transport properties. Moreover, our findings mostly depend on the symmetry of the crystal and apply to a whole class of 1Td TMDCs such as 1Td-WTe$_2$ which is under intense study due to its topological properties. Comment: 34 pages, 12 figures
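For readers unfamiliar with the Pauli paramagnetic limit referred to in this abstract, it is commonly estimated with the standard weak-coupling BCS expression below. This is textbook background rather than a formula taken from the abstract; it assumes a free-electron $g$-factor of 2 and neglects spin-orbit scattering, and the symbols $H_P$, $\Delta_0$ and $T_c$ are introduced here for illustration.

```latex
\[
\mu_0 H_P = \frac{\Delta_0}{\sqrt{2}\,\mu_B},
\qquad
\Delta_0 \approx 1.764\,k_B T_c
\quad\Longrightarrow\quad
\mu_0 H_P \approx 1.86\ \mathrm{T\,K^{-1}} \times T_c .
\]
```

An in-plane upper critical field well above this estimate is the kind of observation the abstract attributes to strong spin-orbit coupling.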

    A fast and robust open-switch fault diagnosis method for variable-speed PMSM system

    Traditional open-switch fault diagnosis methods suffer from poor rapidity or robustness. To solve this issue, a new differential current observer-based fault diagnosis method is proposed in this article. With the designed differential observer, fault symptoms (residuals) can be generated and adopted for fault diagnosis easily. Considering that the residuals are sensitive to motor operating conditions in the conventional model-based method due to model error, an adaptive fault detection threshold is designed. As a result, the false detection and missed detection caused by changes in working conditions can be avoided, and stronger robustness against speed, load, and parameter variations can be achieved with superior rapidity compared with existing methods. Finally, the rapidity and robustness of the proposed fault diagnosis method are verified through experimental results.

    Identification and Construction of a Long Noncoding RNA Prognostic Risk Model for Stomach Adenocarcinoma Patients

    Background. Long noncoding RNA-based prognostic biomarkers have demonstrated great potential in the diagnosis and prognosis of cancer patients. However, systematic assessment of a multiple lncRNA-composed prognostic risk model is lacking in stomach adenocarcinoma (STAD). This study is aimed at constructing a lncRNA-based prognostic risk model for STAD patients. Methods. RNA sequencing data and clinical information of STAD patients were retrieved from The Cancer Genome Atlas (TCGA) database. Differentially expressed lncRNAs (DElncRNAs) were identified using the R software. Univariate and multivariate Cox regression analyses were performed to construct a prognostic risk model. The survival analysis, C-index, and receiver operating characteristic (ROC) curve were employed to assess the sensitivity and specificity of the model. The results were verified using the GEPIA online tool and our clinical samples. Pearson correlation coefficient analysis, Gene Ontology (GO), and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment were performed to indicate the potential biological functions of the selected lncRNA. Results. A total of 1917 DElncRNAs were identified from 343 cases of STAD tissues and 30 cases of noncancerous tissues. According to univariate and multivariable Cox regression analyses, four DElncRNAs (AC129507.1, LINC02407, AL022316.1, and AP000695.2) were selected to establish a prognostic risk model. There was a significant difference in the overall survival between high-risk patients and low-risk patients based on this risk model. The C-index of the model was 0.652. The area under the curve (AUC) for the ROC curve was 0.769. GEPIA results confirmed the expression and prognostic significance of AP000695.2 in STAD. Our clinical data confirmed that upregulated expression of AP000695.2 was correlated with the T stage, distant metastasis, and TNM stage in STAD. GO and KEGG analyses demonstrated that AP000695.2 was closely related to the tumorigenesis process. Conclusions. In this study, we constructed a lncRNA-based prognostic risk model for STAD patients. Our study will provide novel insight into the diagnosis and prognosis of STAD patients.
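To make the C-index reported in this abstract more concrete, the sketch below computes a linear risk score and Harrell's concordance index on toy data. It is an illustration only: the coefficients, expression values, and survival times are invented, the helper names `risk_score` and `concordance_index` are hypothetical, and the study itself used R rather than this code.

```python
def risk_score(expression, coefficients):
    """Linear risk score: sum of coefficient * expression over model genes."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def concordance_index(times, events, risks):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when subject i had an observed event and a
    shorter follow-up time than subject j. It is concordant when the subject
    with the earlier event also has the higher risk; risk ties count 0.5.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored subjects cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical coefficients for the four lncRNAs named in the abstract.
coefs = {
    "AC129507.1": 0.3,
    "LINC02407": 0.2,
    "AL022316.1": -0.1,
    "AP000695.2": 0.4,
}

# Invented expression profiles for four toy patients.
patients = [
    {"AC129507.1": 2.0, "LINC02407": 1.0, "AL022316.1": 0.5, "AP000695.2": 3.0},
    {"AC129507.1": 1.0, "LINC02407": 1.0, "AL022316.1": 1.0, "AP000695.2": 2.0},
    {"AC129507.1": 0.5, "LINC02407": 0.5, "AL022316.1": 2.0, "AP000695.2": 1.0},
    {"AC129507.1": 0.1, "LINC02407": 0.2, "AL022316.1": 3.0, "AP000695.2": 0.1},
]
times = [2, 4, 6, 8]    # follow-up time per patient (arbitrary units)
events = [1, 1, 0, 1]   # 1 = death observed, 0 = censored

risks = [risk_score(p, coefs) for p in patients]
print(concordance_index(times, events, risks))
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, which gives a scale for reading the 0.652 reported above.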