727 research outputs found

    Techniques for modeling and analyzing RNA and protein folding energy landscapes

    RNA and protein molecules undergo a dynamic folding process that is important to their function. Computational methods are critical for studying this folding process because it is difficult to observe experimentally. In this work, we introduce new computational techniques to study RNA and protein energy landscapes, including a method to approximate an RNA energy landscape with a coarse graph (map) and new tools for analyzing graph-based approximations of RNA and protein energy landscapes. These analysis techniques can be used to study RNA and protein folding kinetics such as population kinetics, folding rates, and the folding of particular subsequences. In particular, a map-based Master Equation (MME) method can be used to analyze the population kinetics of the maps, while another map analysis tool, map-based Monte Carlo (MMC) simulation, can extract stochastic folding pathways from the map. To validate the results, I compared our methods with other computational methods and with experimental studies of RNA and protein. I first compared our MMC and MME methods for RNA with other computational methods working on the complete energy landscape and show that the approximate map captures the major features of a much larger (e.g., by orders of magnitude) complete energy landscape. Moreover, I show that the methods scale well to large molecules, e.g., RNA with 200+ nucleotides. Then, I correlate the computational results with experimental findings. I present comparisons with two experimental cases to show how I can predict kinetics-based functional rates of ColE1 RNAII and MS2 phage RNA and their mutants using our MME and MMC tools, respectively. I also show that the MME and MMC tools can be applied to map-based approximations of protein energy landscapes and present kinetics analysis results for several proteins.
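To make the idea of master-equation population kinetics on a map concrete, here is a minimal sketch. It is not the thesis's MME implementation: the three-node map, the rate values, and the function name `simulate_master_equation` are all assumptions chosen for illustration, and plain forward-Euler integration stands in for whatever solver the author used. It integrates dp_i/dt = sum_j (k_ji p_j - k_ij p_i), where p_i is the population of map node i.

```python
def simulate_master_equation(rates, p0, dt, steps):
    """Forward-Euler integration of dp_i/dt = sum_j (k_ji p_j - k_ij p_i).

    rates[i][j] is a hypothetical transition rate from node i to node j.
    p0 is the initial population of each node and should sum to 1.
    """
    n = len(p0)
    p = list(p0)
    for _ in range(steps):
        dp = [0.0] * n
        for i in range(n):
            for j in range(n):
                if i != j:
                    flow = rates[i][j] * p[i]  # probability flowing i -> j
                    dp[i] -= flow
                    dp[j] += flow
        p = [p[i] + dt * dp[i] for i in range(n)]
    return p


# Toy map (assumed values): node 0 = unfolded, 1 = intermediate, 2 = folded.
RATES = [
    [0.0, 1.0, 0.0],
    [0.5, 0.0, 2.0],
    [0.0, 0.1, 0.0],
]

# Start with the whole population in the unfolded node.
final = simulate_master_equation(RATES, [1.0, 0.0, 0.0], dt=0.01, steps=5000)
print([round(x, 3) for x in final])
```

Because every outgoing flow is added back to its destination node, total population is conserved at each step, and for this toy chain the populations relax toward the stationary distribution fixed by the rate ratios.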

    Hearth: A Game Supporting Non-Intrusive and Concurrent Tracking of Player Emotion and Mouse Usage

    Empirical evidence has supported the idea that eSports players' emotions could be reflected in their mouse usage. Still, findings from IS literature on the exact relationships between users' mouse usage patterns and their emotional states have been mixed. Possible causes include adjustment effects and offsetting effects. To address these problems, this study proposes a self-developed game named Hearth, which supports non-intrusive and concurrent tracking of players' emotions and mouse usage. The game design supports the examination of the two possible effects. Results show that negative emotion was positively associated with the total mouse movement distance in a game turn, average task-level distance, and average task-level speed. Moreover, the open-source game proposed in this study facilitates further data collection from natural experiments due to its triadic design that addresses reality, meaning, and play.

    Sentiment Analysis on Inflation after Covid-19

    We apply traditional machine learning and deep learning methods to global tweets from 2017-2022 to build a high-frequency index of public sentiment on inflation and analyze its correlation with other online data sources such as Google Trends and a market-oriented inflation index. We use manually labeled trigrams to test the prediction performance of several machine learning models (logistic regression, random forest, etc.) and choose a BERT model for the final demonstration. We then sum the daily tweets' sentiment scores from the BERT model to obtain the predicted inflation sentiment index, and we further analyze the regional and pre-/post-Covid patterns of these indexes. Lastly, we take other empirical inflation-related data as references and show that the Twitter-based inflation sentiment analysis method has an outstanding capability to predict inflation. The results suggest that Twitter combined with deep learning methods can be a novel and timely way to utilize existing abundant data sources on inflation expectations and provide daily indicators of consumers' perception of inflation. Comment: 18 pages, 12 figures
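The aggregation step described in the abstract above (summing per-tweet sentiment scores by day to form an index) can be sketched as follows. This is only an illustration under stated assumptions: the `(day, score)` input format, the sample values, and the function name `daily_sentiment_index` are hypothetical, and the scores are taken as given rather than produced by a BERT model.

```python
from collections import defaultdict


def daily_sentiment_index(scored_tweets):
    """Sum per-tweet sentiment scores by calendar day.

    scored_tweets is an iterable of (day, score) pairs, where day is an
    ISO date string and score is a signed sentiment score (assumed format).
    """
    totals = defaultdict(float)
    for day, score in scored_tweets:
        totals[day] += score
    # Return days in chronological order (ISO strings sort by date).
    return dict(sorted(totals.items()))


# Hypothetical scored tweets; in practice the scores would come from a classifier.
SAMPLE = [
    ("2021-03-02", -1.0),
    ("2021-03-01", 1.0),
    ("2021-03-01", -1.0),
    ("2021-03-01", 1.0),
]

index = daily_sentiment_index(SAMPLE)
print(index)
```

A plain sum makes the index sensitive to daily tweet volume as well as tone; dividing by the daily tweet count would be one alternative design, though the abstract describes a sum.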

    Price of Stability in Quality-Aware Federated Learning

    Federated Learning (FL) is a distributed machine learning scheme that enables clients to train a shared global model without exchanging local data. The presence of label noise can severely degrade FL performance, and some existing studies have focused on algorithm design for label denoising. However, they ignore the important issue that clients may not apply costly label denoising strategies because they are self-interested and have heterogeneous valuations of FL performance. To fill this gap, we model the clients' interactions as a novel label denoising game and characterize its equilibrium. We also analyze the price of stability, which quantifies the difference in system performance (e.g., global model accuracy, social welfare) between the equilibrium outcome and the socially optimal solution. We prove that the equilibrium outcome always leads to a lower global model accuracy than the socially optimal solution does. We further design an efficient algorithm to compute the socially optimal solution. Numerical experiments on the MNIST dataset show that the price of stability increases as the clients' data become noisier, calling for an effective incentive mechanism. Comment: Accepted to IEEE GLOBECOM 202

    A Dialectical Analysis on Upgrading Underdeveloped Guangdong Agriculture with Digital Ecological Industry

    The upgrading of underdeveloped Guangdong agriculture is analysed using materialist dialectics. Agriculture should not be seen as a symbol of backwardness, but rather as an important ecological industry that can be upgraded with advanced digital science and technology. Cognitive innovation and environmental innovation are emphasized in attracting innovative talents and supporting the digital ecological industry. The upgrading path of the digital ecological industry highlights top-level design and systematic planning. Overall, the document emphasizes the strategic value of agriculture and the potential for Guangdong to play a dominant role in the Regional Comprehensive Economic Partnership (RCEP) through digital ecological industry upgrading. Digitalization may enable Guangdong to integrate agriculture, industry, and service industries to achieve ecological civilization.

    Analyzing Survival Data for Sequentially Randomized Designs

    Sequentially randomized designs are becoming common in biomedical research, particularly in clinical trials. These trials are usually designed to evaluate and compare the effect of different treatment regimes. In such designs, eligible patients are first randomly assigned to receive one of the initial treatments. Patients meeting some criteria (e.g. no progressive diseases) are then randomized to receive one of the maintenance treatments. Usually, the procedure continues until all treatment options are exhausted. Such multistage treatment assignment results in dynamic treatment regimes consisting of initial treatment, intermediate response and second stage treatment. However, methods for efficient analysis of sequentially randomized trials have only been developed very recently. As a result, earlier clinical trials reported results based only on the comparison of stage-specific treatments. We first propose to use accelerated failure time and proportional hazards models for estimating the effects of treatment regimes from sequentially randomized designs. Based on the proposed models, differences between treatment regimes in terms of their hazards are tested. We investigate the properties of these methods and tests in a Monte Carlo simulation study. Finally, the proposed models are applied to the long-term outcome of the high risk neuroblastoma study. We then extend the proportional hazards model to a generalized Cox proportional hazards model that applies to comparisons of any combination of any number of treatment regimes regardless of the number of stages of treatment. Contrasts of dynamic treatment regimes are tested using the Wald chi-square method. Both the model and Wald chi-square tests of contrasts are illustrated through a simulation study and an application to a high risk neuroblastoma study to complement the earlier results reported on this study. Chronic diseases such as cancer and cardiovascular diseases are major causes of mortality and morbidity in the United States and in the world. Sequentially randomized designs are commonly used in clinical studies investigating treatments of chronic diseases such as cancer, AIDS, and depression. The public health significance of the methodologies proposed in this research is to allow efficient analysis of data from such studies and thereby enhance the discovery of efficient maintenance and eradication strategies for chronic diseases.

    NEW TEST STATISTIC FOR COMPARING MEDIANS WITH INCOMPLETE PAIRED DATA

    This paper is concerned with nonparametric methods for comparing medians of paired data with unpaired values on both responses. A new nonparametric test statistic is proposed in this paper based on a Mann-Whitney U test making comparisons across complete and incomplete pairs. A method of finding the null hypothesis distribution for this statistic is presented using a permutation approach. A Monte Carlo simulation study is described to make power comparisons among four already-existing nonparametric test statistics and this new test statistic. It is concluded that this new test statistic is fairly powerful in handling this kind of data compared to the other four test statistics. Finally, all five test statistics are applied to a real dataset for comparing the proportions of certain T cell receptor gene families in a cancer study. The introduction of this new nonparametric test statistic is of public health importance because it is a powerful statistical method for dealing with a pattern of missing data that may be encountered in clinical and public health research
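As a rough illustration of obtaining a null distribution by permutation for a Mann-Whitney-type statistic, the sketch below enumerates every relabeling of two small independent samples. It is not the statistic proposed in the paper, which combines complete and incomplete pairs; it is the ordinary two-sample U statistic, and the function names and sample values are assumptions made for this example.

```python
from itertools import combinations


def mann_whitney_u(x, y):
    """Count pairs with x_i > y_j, counting each tie as one half."""
    return sum(1.0 if a > b else 0.5 if a == b else 0.0 for a in x for b in y)


def exact_permutation_p_value(x, y):
    """Two-sided exact permutation p-value for the U statistic.

    Enumerates every way to relabel the pooled values into groups of the
    original sizes, so it is only practical for small samples.
    """
    pooled = list(x) + list(y)
    m, n = len(x), len(y)
    center = m * n / 2.0  # value of U expected under the null hypothesis
    observed = abs(mann_whitney_u(x, y) - center)
    extreme = 0
    total = 0
    for idx in combinations(range(m + n), m):
        chosen = set(idx)
        perm_x = [pooled[i] for i in idx]
        perm_y = [pooled[i] for i in range(m + n) if i not in chosen]
        total += 1
        if abs(mann_whitney_u(perm_x, perm_y) - center) >= observed:
            extreme += 1
    return extreme / total


# Hypothetical, fully separated samples.
p_value = exact_permutation_p_value([1, 2, 3], [4, 5, 6])
print(p_value)
```

For larger samples, full enumeration becomes infeasible and a random subset of relabelings is typically drawn instead, which is the usual Monte Carlo variant of the same idea.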

    Epigenetic regulation of genes related to lipid metabolism by microrna in mice fed high fat diet

    A high fat diet impacts lipid metabolism by altering the transportation, oxidation, and storage of fatty acids. Lipoprotein lipase (LPL) plays a critical role in lipid metabolism by catalyzing triglyceride hydrolysis and lipoprotein uptake in multiple tissues. A previous study reported that miR-29b negatively regulated LPL expression in mammary epithelial cells transfected with miR-29b mimics. The present study investigated changes in LPL expression and epigenetic mechanisms by the miR-29 family in different tissues in mice fed a high fat diet. Five-week-old male CBA mice were fed either a control diet (Con group, 10% kcal from fat) or a high fat diet (HF group, 45% kcal from fat) ad libitum for 11 weeks. The results showed that LPL mRNA was increased in adipose, muscle and colon in response to the high fat diet. However, the high fat diet decreased the expression of LPL mRNA, as well as hepatic lipase (HL), in the liver. The results also showed the highest expression level of LPL mRNA in adipose tissue, followed by muscle, colon, and liver. Meanwhile, the high fat diet reduced the expression of miR-29a/b, predicted suppressors of LPL from the miR-29 family, in adipose tissue. Genomic analysis predicted several potential transcription factors of miR-29 family members that suppress the expression of miR-29s. At the mRNA level, some of these transcription factors, namely c-Myc and EZH2, were significantly activated in response to the HF diet. The present results indicated that LPL expression could be activated by a high fat diet in multiple tissues and that the induction of LPL is post-transcriptionally regulated by miR-29a/b. Furthermore, the transcription of miR-29 in mouse adipose tissue was regulated by certain transcription factors. Overall, LPL mRNA was altered in multiple tissues in response to a high fat diet and is potentially regulated through transcription factors and microRNAs.