
    Quantitative comparison of EST libraries requires compensation for systematic biases in cDNA generation

    BACKGROUND: Publicly accessible EST libraries contain valuable information that can be utilized for studies of tissue-specific gene expression and processing of individual genes. This information is, however, confounded by multiple systematic effects arising from the procedures used to generate these libraries. RESULTS: We used alignment of ESTs against a reference set of transcripts to estimate the size distributions of the cDNA inserts and sampled mRNA transcripts in individual EST libraries, and we show how these measurements can be used to inform quantitative comparisons of libraries. While significant attention has been paid to the effects of normalization and subtraction, we also find significant biases in transcript sampling introduced by the combined procedures of reverse transcription and selection of cDNA clones for sequencing. Using examples drawn from studies of mRNA 3'-processing (cleavage and polyadenylation), we demonstrate the effects of the transcript sampling bias and provide a method for identifying libraries that can be safely compared without bias. All data sets, supplemental data, and software are available at our supplemental web site [1]. CONCLUSION: The biases we characterize in the transcript sampling of EST libraries represent a significant and heretofore under-appreciated source of false-positive candidates for tissue-, cell type-, or developmental stage-specific activity or processing of genes. Uncorrected, quantitative comparison of dissimilar EST libraries will likely result in the identification of statistically significant but biologically meaningless changes.
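
    A minimal Python sketch of the kind of comparison described above, assuming each library is summarized by per-EST distances (in nucleotides) from the EST 3' end to the annotated transcript 3' end, obtained from alignments against the reference transcripts; a two-sample Kolmogorov-Smirnov test is used here simply to flag library pairs whose sampling-distance distributions differ too much to be compared safely. The function name, input representation, and the 0.05 threshold are illustrative assumptions, not the authors' procedure.

    # Sketch: compare the transcript-sampling profiles of two EST libraries.
    # Assumed inputs: arrays of per-EST distances (nt) from the EST 3' end to the
    # annotated transcript 3' end, derived from EST-to-reference alignments.
    import numpy as np
    from scipy.stats import ks_2samp

    def libraries_comparable(dist_a, dist_b, alpha=0.05):
        """Return whether the two sampling-distance distributions are not
        detectably different (illustrative criterion, not the paper's)."""
        stat, p_value = ks_2samp(dist_a, dist_b)
        return p_value >= alpha, stat, p_value

    # Toy example with simulated libraries: library B is biased toward
    # 3'-proximal sampling (shorter distances), as cDNA clone selection can cause.
    rng = np.random.default_rng(0)
    lib_a = rng.exponential(scale=800, size=5000)   # distances in nt
    lib_b = rng.exponential(scale=400, size=5000)
    ok, stat, p = libraries_comparable(lib_a, lib_b)
    print(f"comparable={ok}  KS={stat:.3f}  p={p:.2e}")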

    Learn Goal-Conditioned Policy with Intrinsic Motivation for Deep Reinforcement Learning

    It is important for an agent to learn a widely applicable and general-purpose policy that can achieve diverse goals, including images and text descriptions. Considering such perceptually-specific goals, the frontier of deep reinforcement learning research is to learn a goal-conditioned policy without hand-crafted rewards. To learn this kind of policy, recent works usually take as the reward the non-parametric distance to a given goal in an explicit embedding space. From a different viewpoint, we propose a novel unsupervised learning approach named goal-conditioned policy with intrinsic motivation (GPIM), which jointly learns both an abstract-level policy and a goal-conditioned policy. The abstract-level policy is conditioned on a latent variable to optimize a discriminator and discovers diverse states that are further rendered into perceptually-specific goals for the goal-conditioned policy. The learned discriminator serves as an intrinsic reward function for the goal-conditioned policy to imitate the trajectory induced by the abstract-level policy. Experiments on various robotic tasks demonstrate the effectiveness and efficiency of our proposed GPIM method, which substantially outperforms prior techniques. Comment: Accepted by AAAI-2
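
    A minimal PyTorch sketch of the discriminator-as-intrinsic-reward ingredient described above: a discriminator q(z|s) is trained to recover the latent variable that conditioned the abstract-level policy, and its log-probability is reused as an intrinsic reward. The network sizes, the uniform prior, and the reward form log q(z|s) - log p(z) are assumptions for illustration, not the paper's exact formulation.

    # Sketch: discriminator-based intrinsic reward for a latent-conditioned policy.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class Discriminator(nn.Module):
        """Predicts which latent code z generated the visited state s."""
        def __init__(self, state_dim, n_latents, hidden=128):
            super().__init__()
            self.net = nn.Sequential(
                nn.Linear(state_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, n_latents),
            )
        def forward(self, state):
            return self.net(state)          # logits over latent codes

    def intrinsic_reward(disc, state, z, n_latents):
        """r = log q(z|s) - log p(z), with a uniform prior p(z) (assumed)."""
        with torch.no_grad():
            log_q = F.log_softmax(disc(state), dim=-1)
            return log_q.gather(-1, z.unsqueeze(-1)).squeeze(-1) \
                + torch.log(torch.tensor(float(n_latents)))

    # Toy usage: 4 latent "goals", 8-dimensional states.
    disc = Discriminator(state_dim=8, n_latents=4)
    states = torch.randn(32, 8)
    z = torch.randint(0, 4, (32,))
    print(intrinsic_reward(disc, states, z, n_latents=4).shape)  # torch.Size([32])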

    CLUE: Calibrated Latent Guidance for Offline Reinforcement Learning

    Offline reinforcement learning (RL) aims to learn an optimal policy from pre-collected and labeled datasets, which eliminates the time-consuming data collection of online RL. However, offline RL still bears the large burden of specifying/handcrafting extrinsic rewards for each transition in the offline data. As a remedy for this labor-intensive labeling, we propose to endow offline RL tasks with a small amount of expert data and use the limited expert data to drive intrinsic rewards, thus eliminating the need for extrinsic rewards. To achieve that, we introduce Calibrated Latent gUidancE (CLUE), which utilizes a conditional variational auto-encoder to learn a latent space such that intrinsic rewards can be directly quantified over the latent space. CLUE's key idea is to align the intrinsic rewards with the expert intention by enforcing the embeddings of expert data onto a calibrated contextual representation. We instantiate the expert-driven intrinsic rewards in sparse-reward offline RL tasks, offline imitation learning (IL) tasks, and unsupervised offline RL tasks. Empirically, we find that CLUE can effectively improve sparse-reward offline RL performance, outperform state-of-the-art offline IL baselines, and discover diverse skills from static reward-free offline data.
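
    A minimal sketch of the intrinsic-reward idea described above, assuming an already-trained encoder that maps transitions to latent embeddings; the expert "calibrated" context is taken here as the mean of the expert embeddings, and the intrinsic reward as a decaying function of the distance to it. Function and variable names are hypothetical, and the actual CLUE objective (conditional VAE plus its calibration scheme) is not reproduced.

    # Sketch: expert-calibrated intrinsic rewards in a learned latent space.
    import numpy as np

    def calibrated_context(expert_embeddings):
        """Summarize expert (s, a) embeddings by their mean (simplified calibration)."""
        return expert_embeddings.mean(axis=0)

    def intrinsic_rewards(embeddings, context, scale=1.0):
        """Reward transitions whose embeddings lie close to the expert context."""
        dists = np.linalg.norm(embeddings - context, axis=1)
        return np.exp(-scale * dists)      # in (0, 1]; higher = more expert-like

    # Toy usage with random stand-ins for encoder outputs.
    rng = np.random.default_rng(0)
    expert_z = rng.normal(0.0, 0.1, size=(50, 16))     # embeddings of expert data
    offline_z = rng.normal(0.5, 0.5, size=(1000, 16))  # embeddings of offline data
    r = intrinsic_rewards(offline_z, calibrated_context(expert_z))
    print(r.shape, float(r.max()), float(r.min()))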

    Data Fusion of Electronic Nose and Electronic Tongue for Detection of Mixed Edible-Oil

    To reduce waste of edible oil in food processing while maintaining food safety, producers often add fresh edible oil to old frying oil that has already been used, in order to control production costs. Because different blending proportions yield different compositions and different volatile gases, in this paper we use data fusion of an electronic nose and an electronic tongue to detect the blending ratio of old frying oil and fresh edible oil. Principal component analysis (PCA) is used to distinguish the different proportions of old frying oil and fresh edible oil, and partial least squares (PLS) regression is used to predict the blending ratio. Two conclusions are drawn: data fusion of the electronic nose and the electronic tongue can be used to detect the blending ratio of old frying oil and fresh edible oil, and the fused data give better detection performance than either the electronic nose or the electronic tongue used alone.
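
    A minimal scikit-learn sketch of the fusion pipeline outlined above: feature-level concatenation of electronic-nose and electronic-tongue measurements, PCA for inspecting the separation of blending proportions, and PLS regression to predict the blending ratio. The array shapes, simulated data, and model settings are assumptions; the paper's sensor features and parameters are not reproduced.

    # Sketch: feature-level fusion of e-nose and e-tongue data with PCA and PLS.
    import numpy as np
    from sklearn.preprocessing import StandardScaler
    from sklearn.decomposition import PCA
    from sklearn.cross_decomposition import PLSRegression
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n_samples = 120
    ratio = rng.uniform(0.0, 1.0, n_samples)                 # true blending ratio
    e_nose = ratio[:, None] * 2.0 + rng.normal(0, 0.3, (n_samples, 10))
    e_tongue = ratio[:, None] * -1.5 + rng.normal(0, 0.3, (n_samples, 7))

    X = StandardScaler().fit_transform(np.hstack([e_nose, e_tongue]))  # data fusion

    # PCA: inspect how well different blending proportions separate.
    scores = PCA(n_components=2).fit_transform(X)

    # PLS: predict the blending ratio from the fused features.
    X_tr, X_te, y_tr, y_te = train_test_split(X, ratio, random_state=0)
    pls = PLSRegression(n_components=4).fit(X_tr, y_tr)
    print("R^2 on held-out samples:", round(pls.score(X_te, y_te), 3))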

    Beyond Reward: Offline Preference-guided Policy Optimization

    This study focuses on the topic of offline preference-based reinforcement learning (PbRL), a variant of conventional reinforcement learning that dispenses with the need for online interaction or specification of reward functions. Instead, the agent is provided with fixed offline trajectories and human preferences between pairs of trajectories to extract the dynamics and task information, respectively. Since the dynamics and task information are orthogonal, a naive approach would involve using preference-based reward learning followed by an off-the-shelf offline RL algorithm. However, this requires the separate learning of a scalar reward function, which is assumed to be an information bottleneck of the learning process. To address this issue, we propose the offline preference-guided policy optimization (OPPO) paradigm, which models offline trajectories and preferences in a one-step process, eliminating the need for separately learning a reward function. OPPO achieves this by introducing an offline hindsight information matching objective for optimizing a contextual policy and a preference modeling objective for finding the optimal context. OPPO further integrates a well-performing decision policy by optimizing the two objectives iteratively. Our empirical results demonstrate that OPPO effectively models offline preferences and outperforms prior competing baselines, including offline RL algorithms performed over either true or pseudo reward function specifications. Our code is available on the project website: https://sites.google.com/view/oppo-icml-2023
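
    A minimal PyTorch sketch of a preference-modeling objective in the spirit of the above: trajectory embeddings are scored by their closeness to a learnable "optimal" context, and a Bradley-Terry-style loss pushes preferred trajectories closer to that context than non-preferred ones. The encoder, the distance-based score, and all names are illustrative assumptions; OPPO's full hindsight information matching objective is not reproduced.

    # Sketch: Bradley-Terry preference loss over trajectory embeddings and a
    # learnable optimal context.
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class TrajEncoder(nn.Module):
        """Mean-pool per-step features into one trajectory embedding (simplified)."""
        def __init__(self, step_dim, z_dim=32):
            super().__init__()
            self.proj = nn.Linear(step_dim, z_dim)
        def forward(self, traj):               # traj: (batch, steps, step_dim)
            return self.proj(traj).mean(dim=1) # (batch, z_dim)

    def preference_loss(enc, z_star, traj_pref, traj_other):
        """Preferred trajectories should score higher (lie closer to z_star)."""
        score_p = -torch.norm(enc(traj_pref) - z_star, dim=-1)
        score_o = -torch.norm(enc(traj_other) - z_star, dim=-1)
        return -F.logsigmoid(score_p - score_o).mean()

    # Toy usage: 16 preference pairs of 20-step trajectories with 10-dim steps.
    enc = TrajEncoder(step_dim=10)
    z_star = nn.Parameter(torch.zeros(32))
    loss = preference_loss(enc, z_star, torch.randn(16, 20, 10), torch.randn(16, 20, 10))
    loss.backward()
    print(float(loss))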

    VGDiffZero: Text-to-image Diffusion Models Can Be Zero-shot Visual Grounders

    Large-scale text-to-image diffusion models have shown impressive capabilities across various generative tasks, enabled by the strong vision-language alignment obtained through pre-training. However, most vision-language discriminative tasks require extensive fine-tuning on carefully labeled datasets to acquire such alignment, at great cost in time and computing resources. In this work, we explore directly applying a pre-trained generative diffusion model to the challenging discriminative task of visual grounding, without any fine-tuning or additional training data. Specifically, we propose VGDiffZero, a simple yet effective zero-shot visual grounding framework based on text-to-image diffusion models. We also design a comprehensive region-scoring method considering both the global and local contexts of each isolated proposal. Extensive experiments on RefCOCO, RefCOCO+, and RefCOCOg show that VGDiffZero achieves strong performance on zero-shot visual grounding.
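
    A minimal sketch of diffusion-based region scoring along the lines described above: each isolated proposal is noised and scored by how well a pre-trained text-conditioned denoiser reconstructs the noise given the referring expression, and the proposal with the lowest denoising error is returned. The denoise_fn(noisy, t, text_emb) callable, the toy noise schedule, and the single-timestep scoring are assumptions standing in for a real diffusion model and the paper's global/local scoring scheme.

    # Sketch: zero-shot visual grounding by scoring proposals with a text-conditioned
    # diffusion denoiser (lower denoising error = better match to the expression).
    import torch

    def score_proposals(proposal_images, text_emb, denoise_fn, t=250, seed=0):
        """proposal_images: (N, C, H, W) isolated proposals; returns best index.
        denoise_fn is a hypothetical callable wrapping a pre-trained diffusion
        model: it predicts the added noise from (noisy_image, timestep, text_emb)."""
        g = torch.Generator().manual_seed(seed)
        noise = torch.randn(proposal_images.shape, generator=g)
        alpha_bar = 1.0 - t / 1000.0   # toy schedule; a real model defines its own
        noisy = (alpha_bar ** 0.5) * proposal_images + ((1 - alpha_bar) ** 0.5) * noise
        errors = []
        for i in range(proposal_images.shape[0]):
            pred = denoise_fn(noisy[i:i+1], t, text_emb)
            errors.append(torch.mean((pred - noise[i:i+1]) ** 2))
        errors = torch.stack(errors)
        return int(torch.argmin(errors)), errors

    # Toy usage with a stand-in denoiser (a real one would come from a pre-trained
    # text-to-image diffusion model).
    dummy_denoiser = lambda noisy, t, text_emb: torch.zeros_like(noisy)
    best, errs = score_proposals(torch.randn(5, 3, 64, 64), torch.randn(77, 768),
                                 dummy_denoiser)
    print(best, errs.shape)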

    PSSA: PCA-domain superpixelwise singular spectral analysis for unsupervised hyperspectral image classification.

    Although supervised classification of hyperspectral images (HSI) has achieved success in remote sensing, its application in real scenarios is often constrained, mainly due to insufficient or unavailable labelled data. As a result, unsupervised HSI classification based on data clustering is highly desirable, yet it generally suffers from high computational cost and low classification accuracy, especially on large datasets. To tackle these challenges, a novel unsupervised spatial-spectral HSI classification method is proposed. By combining entropy rate superpixel segmentation (ERS), superpixel-based principal component analysis (PCA), and PCA-domain 2D singular spectral analysis (SSA), both the efficacy and efficiency of feature extraction are improved, followed by anchor-based graph clustering (AGC) for effective classification. Experiments on three publicly available and five self-collected aerial HSI datasets fully demonstrate the efficacy of the proposed PCA-domain superpixelwise SSA (PSSA) method, with a gain of 15–20% in overall accuracy compared with a few state-of-the-art methods. In addition, as an extra outcome, the HSI dataset we acquired is provided freely online.
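
    A minimal sketch of a spatial-spectral pipeline in the spirit of the above: PCA reduces the spectral dimension, features are averaged within a precomputed superpixel map to inject spatial context, and a clustering step assigns unsupervised labels. Here a regular grid stands in for ERS superpixels, KMeans stands in for the anchor-based graph clustering, and the 2D SSA step is omitted; all of these are simplifications of the paper's method.

    # Sketch: unsupervised HSI classification via PCA + superpixel averaging + clustering.
    import numpy as np
    from sklearn.decomposition import PCA
    from sklearn.cluster import KMeans

    def grid_superpixels(h, w, block=8):
        """Stand-in for ERS: a regular grid of block x block superpixels."""
        rows, cols = np.indices((h, w))
        return (rows // block) * ((w + block - 1) // block) + (cols // block)

    def pssa_like_labels(cube, n_classes, n_pcs=10, block=8):
        h, w, bands = cube.shape
        pcs = PCA(n_components=n_pcs).fit_transform(cube.reshape(-1, bands))
        sp = grid_superpixels(h, w, block).reshape(-1)
        # Replace each pixel's features by its superpixel mean (spatial smoothing).
        feats = np.zeros_like(pcs)
        for s in np.unique(sp):
            feats[sp == s] = pcs[sp == s].mean(axis=0)
        labels = KMeans(n_clusters=n_classes, n_init=10, random_state=0).fit_predict(feats)
        return labels.reshape(h, w)

    # Toy usage on a random 32x32 cube with 50 spectral bands.
    cube = np.random.default_rng(0).normal(size=(32, 32, 50))
    print(pssa_like_labels(cube, n_classes=4).shape)   # (32, 32)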

    Sequential multiple assignment randomization trials with enrichment design

    The sequential multiple assignment randomization trial (SMART) is a powerful design to study dynamic treatment regimes (DTRs) and allows causal comparisons of DTRs. To handle practical challenges of SMART, we propose a SMART with Enrichment (SMARTer) design, which performs stage-wise enrichment for SMART. SMARTer can improve design efficiency, shorten the recruitment period, and partially reduce trial duration, making SMART more practical under limited time and resources. Specifically, at each subsequent stage of a SMART, we enrich the study sample with new patients who have received the previous stages' treatments in a naturalistic fashion without randomization, and we randomize them only among the current-stage treatment options. One extreme case of SMARTer is to synthesize separate, independent single-stage randomized trials with patients who have received the previous-stage treatments. We show that data from SMARTer allow unbiased estimation of DTRs, as SMART does, under certain assumptions. Furthermore, we show analytically that the efficiency gain of the new design over SMART can be significant, especially when the dropout rate is high. Lastly, extensive simulation studies are performed to demonstrate the performance of the SMARTer design, and a sample size estimation in a scenario informed by real data from a SMART study is presented.
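
    A toy Monte Carlo sketch of the efficiency argument above, under strong simplifying assumptions (no covariates, no response-dependent second-stage randomization, stage-1 treatment of enriched patients independent of potential outcomes, complete follow-up of enriched patients): it compares the variability of a simple plug-in estimate of one regime's mean outcome when the second stage is, or is not, enriched with additional patients. The generative model and sample sizes are illustrative only, not the paper's simulation design.

    # Sketch: variance of a DTR value estimate, SMART vs. SMART-with-enrichment.
    import numpy as np

    rng = np.random.default_rng(0)

    def outcome(a1, a2, rng):
        return 1.0 + 0.5 * a1 + 0.3 * a2 + rng.normal(0, 1, size=a1.shape)

    def estimate_dtr_mean(n_smart, n_enrich, dropout, rng):
        """Plug-in mean outcome of the regime (a1=1, a2=1)."""
        # SMART cohort: randomized at both stages, some dropout before stage 2.
        a1 = rng.integers(0, 2, n_smart)
        stay = rng.uniform(size=n_smart) > dropout
        a2 = rng.integers(0, 2, n_smart)
        y = outcome(a1, a2, rng)
        y_consistent = y[stay & (a1 == 1) & (a2 == 1)]
        # Enrichment cohort: stage-1 treatment observed (not randomized), stage 2 randomized.
        a1_e = (rng.uniform(size=n_enrich) < 0.7).astype(int)   # naturalistic assignment
        a2_e = rng.integers(0, 2, n_enrich)
        y_e = outcome(a1_e, a2_e, rng)
        est_smart = y_consistent.mean()
        est_smarter = np.concatenate([y_consistent, y_e[(a1_e == 1) & (a2_e == 1)]]).mean()
        return est_smart, est_smarter

    reps = np.array([estimate_dtr_mean(300, 300, dropout=0.4, rng=rng) for _ in range(2000)])
    print("true regime value = 1.8")
    print("SMART   mean/SD:", reps[:, 0].mean().round(3), reps[:, 0].std().round(3))
    print("SMARTer mean/SD:", reps[:, 1].mean().round(3), reps[:, 1].std().round(3))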