201 research outputs found

    Accelerated and Deep Expectation Maximization for One-Bit MIMO-OFDM Detection

    In this paper we study the expectation maximization (EM) technique for one-bit MIMO-OFDM detection (OMOD). Arising from the recent interest in massive MIMO with one-bit analog-to-digital converters, OMOD is a massive-scale problem. EM is an iterative method that can exploit the OFDM structure to process the problem in a per-iteration efficient fashion. In this study we analyze the convergence rate of EM for a class of approximate maximum-likelihood OMOD formulations, or, in a broader sense, a class of problems involving regression from quantized data. We show how the SNR and channel conditions can have an impact on the convergence rate. We do so by making a connection between the EM and the proximal gradient methods in the context of OMOD. This connection also gives us insight to build new accelerated and/or inexact EM schemes. The accelerated scheme has faster convergence in theory, and the inexact scheme provides us with the flexibility to implement EM more efficiently, with a convergence guarantee. Furthermore, we develop a deep EM algorithm, wherein we take the structure of our inexact EM algorithm and apply deep unfolding to train an efficient structured deep net. Simulation results show that our accelerated exact/inexact EM algorithms run much faster than their standard EM counterparts, and that the deep EM algorithm gives promising detection and runtime performance.
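The link between EM and gradient-type methods mentioned in the abstract above can be illustrated on a toy problem. The sketch below is not the paper's algorithm: it assumes a real-valued scalar unknown `x`, one-bit observations `y_i = sign(h_i * x + n_i)` with Gaussian noise of known standard deviation `sigma`, and no OFDM structure. All function names are hypothetical. Under these assumptions, one EM iteration coincides with a gradient-ascent step on the log-likelihood with step size `sigma**2 / sum(h_i**2)`.

```python
import math
import random


def norm_pdf(u):
    """Standard normal density."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def norm_cdf(u):
    """Standard normal CDF; erfc keeps precision for negative u."""
    return 0.5 * math.erfc(-u / math.sqrt(2.0))


def inverse_mills(u):
    """Ratio pdf(u) / cdf(u); fine for the moderate |u| of this toy setup."""
    return norm_pdf(u) / norm_cdf(u)


def em_step(x, h, y, sigma):
    """One EM iteration for y_i = sign(h_i * x + n_i), n_i ~ N(0, sigma^2)."""
    # E-step: conditional mean of the unquantized value z_i given its sign y_i.
    z = [hi * x + sigma * yi * inverse_mills(yi * hi * x / sigma)
         for hi, yi in zip(h, y)]
    # M-step: least-squares fit of x to the imputed values z_i.
    return sum(hi * zi for hi, zi in zip(h, z)) / sum(hi * hi for hi in h)


def gradient_step(x, h, y, sigma):
    """Gradient ascent on sum_i log Phi(y_i h_i x / sigma), fixed step size."""
    grad = sum(yi * hi / sigma * inverse_mills(yi * hi * x / sigma)
               for hi, yi in zip(h, y))
    step = sigma ** 2 / sum(hi * hi for hi in h)
    return x + step * grad


def simulate(x_true, sigma, n, seed=0):
    """Draw toy regressors h_i and one-bit observations y_i."""
    rng = random.Random(seed)
    h = [rng.gauss(0.0, 1.0) for _ in range(n)]
    y = [1.0 if hi * x_true + rng.gauss(0.0, sigma) >= 0 else -1.0 for hi in h]
    return h, y


def run_em(h, y, sigma, iters=100, x0=0.0):
    """Iterate the EM update from the starting point x0."""
    x = x0
    for _ in range(iters):
        x = em_step(x, h, y, sigma)
    return x


if __name__ == "__main__":
    h, y = simulate(x_true=0.8, sigma=1.0, n=2000)
    print("EM estimate:", run_em(h, y, sigma=1.0))
    print("EM step vs. gradient step:",
          em_step(0.3, h, y, 1.0), gradient_step(0.3, h, y, 1.0))
```

In this toy setting the fixed step size shrinks as the noise level drops, which gives a rough intuition for why the SNR could affect how fast EM converges; the paper's actual analysis covers a more general class of formulations.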

    Open-World Multi-Task Control Through Goal-Aware Representation Learning and Adaptive Horizon Prediction

    We study the problem of learning goal-conditioned policies in Minecraft, a popular, widely accessible yet challenging open-ended environment for developing human-level multi-task agents. We first identify two main challenges of learning such policies: 1) the indistinguishability of tasks from the state distribution, due to the vast scene diversity, and 2) the non-stationary nature of environment dynamics caused by partial observability. To tackle the first challenge, we propose Goal-Sensitive Backbone (GSB) for the policy to encourage the emergence of goal-relevant visual state representations. To tackle the second challenge, the policy is further fueled by an adaptive horizon prediction module that helps alleviate the learning uncertainty brought by the non-stationary dynamics. Experiments on 20 Minecraft tasks show that our method significantly outperforms the best baseline so far; in many of them, we double the performance. Our ablation and exploratory studies then explain how our approach beats the counterparts and also unveil the surprising bonus of zero-shot generalization to new scenes (biomes). We hope our agent could help shed some light on learning goal-conditioned, multi-task agents in challenging, open-ended environments like Minecraft. Comment: This paper is accepted by CVPR202

    Correlation-Aware Mutual Learning for Semi-supervised Medical Image Segmentation

    Semi-supervised learning has become increasingly popular in medical image segmentation due to its ability to leverage large amounts of unlabeled data to extract additional information. However, most existing semi-supervised segmentation methods only focus on extracting information from unlabeled data, disregarding the potential of labeled data to further improve the performance of the model. In this paper, we propose a novel Correlation Aware Mutual Learning (CAML) framework that leverages labeled data to guide the extraction of information from unlabeled data. Our approach is based on a mutual learning strategy that incorporates two modules: the Cross-sample Mutual Attention Module (CMA) and the Omni-Correlation Consistency Module (OCC). The CMA module establishes dense cross-sample correlations among a group of samples, enabling the transfer of label prior knowledge to unlabeled data. The OCC module constructs omni-correlations between the unlabeled and labeled datasets and regularizes dual models by constraining the omni-correlation matrix of each sub-model to be consistent. Experiments on the Atrial Segmentation Challenge dataset demonstrate that our proposed approach outperforms state-of-the-art methods, highlighting the effectiveness of our framework in medical image segmentation tasks. The codes, pre-trained weights, and data are publicly available. Comment: MICCAI2023 early accepted, camera ready versio

    <Article> Characteristics of Ancient Japanese Court Politics (Chōsei) from a Japan-China Comparative Perspective

    Chōsei is generally a term for ancient politics in the sense of "politics of the court." Traditionally, ancient Japanese chōsei has often been understood only vaguely, as chōdōsei or asamatsurigoto. In this paper, from the viewpoint of Japan-China comparison, I analyze examples of chōsei found in Japanese and Chinese historical sources while paying attention to the origin of the term, define ancient Japanese chōsei as mikado no matsurigoto ("political affairs of the court"), and point out the epoch-making significance of the reign of Empress Suiko in the history of the establishment of chōsei.

    A two-stage framework for optical coherence tomography angiography image quality improvement

    Introduction: Optical Coherence Tomography Angiography (OCTA) is a new non-invasive imaging modality that is gaining popularity for observing the microvasculature in the retina and the conjunctiva, assisting clinical diagnosis and treatment planning. However, poor imaging quality, such as stripe artifacts and low contrast, is common in acquired OCTA and in particular Anterior Segment OCTA (AS-OCTA) images due to eye microtremor and poor illumination conditions. These issues lead to incomplete vasculature maps, which in turn make accurate interpretation and subsequent diagnosis difficult. Methods: In this work, we propose a two-stage framework that comprises a de-striping stage and a re-enhancing stage, with the aims of removing stripe noise and enhancing blood vessel structure against the background. We introduce a new de-striping objective function in a Stripe Removal Net (SR-Net) to suppress the stripe noise in the original image. The vasculature in acquired AS-OCTA images usually exhibits poor contrast, so in the re-enhancing stage we use a Perceptual Structure Generative Adversarial Network (PS-GAN) to enhance the de-striped AS-OCTA image; it combines cyclic perceptual loss with structure loss to achieve further image quality improvement. Results and discussion: To evaluate the effectiveness of the proposed method, we apply the proposed framework to two synthetic OCTA datasets and a real AS-OCTA dataset. Our results show that the proposed framework yields promising enhancement performance, which enables both conventional and deep learning-based vessel segmentation methods to produce improved results after enhancement of both retina and AS-OCTA modalities.

    GROOT: Learning to Follow Instructions by Watching Gameplay Videos

    We study the problem of building a controller that can follow open-ended instructions in open-world environments. We propose to follow reference videos as instructions, which offer expressive goal specifications while eliminating the need for expensive text-gameplay annotations. A new learning framework is derived to allow learning such instruction-following controllers from gameplay videos while producing a video instruction encoder that induces a structured goal space. We implement our agent GROOT in a simple yet effective encoder-decoder architecture based on causal transformers. We evaluate GROOT against open-world counterparts and human players on a proposed Minecraft SkillForge benchmark. The Elo ratings clearly show that GROOT is closing the human-machine gap as well as exhibiting a 70% winning rate over the best generalist agent baseline. Qualitative analysis of the induced goal space further demonstrates some interesting emergent properties, including the goal composition and complex gameplay behavior synthesis. The project page is available at https://craftjarvis-groot.github.io
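To put the Elo ratings and the 70% winning rate mentioned above on a common scale, the sketch below uses the standard logistic Elo model. This is an assumption: the abstract does not say which Elo variant or K-factor the benchmark uses, nor whether the 70% figure was derived from the ratings. Draws are ignored and the function names are hypothetical.

```python
import math


def expected_score(rating_a, rating_b):
    """Expected win probability of A over B under the standard Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def rating_gap_for_win_rate(p):
    """Rating difference (A minus B) that yields win probability p for A."""
    if not 0.0 < p < 1.0:
        raise ValueError("win rate must be strictly between 0 and 1")
    return 400.0 * math.log10(p / (1.0 - p))


def update(rating_a, rating_b, score_a, k=32.0):
    """Rating update after one game; score_a is 1 for a win, 0 for a loss.

    k=32 is a common default, not a value taken from the paper.
    """
    exp_a = expected_score(rating_a, rating_b)
    new_a = rating_a + k * (score_a - exp_a)
    new_b = rating_b + k * ((1.0 - score_a) - (1.0 - exp_a))
    return new_a, new_b


if __name__ == "__main__":
    gap = rating_gap_for_win_rate(0.70)
    print(f"Gap for a 70% win rate: {gap:.1f} points")
    print(f"Round trip: {expected_score(1500.0 + gap, 1500.0):.2f}")
```

Under this model, a 70% win rate corresponds to a rating gap of roughly 147 points.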

    Theoretical prediction of diffusive ionic current through nanopores under salt gradients

    In charged nanopores, the ionic diffusion current reflects the ionic selectivity and ionic permeability of the nanopores, which determine the performance of osmotic energy conversion, i.e., the output power and efficiency. Here, theoretical predictions of the diffusive currents through cation-selective nanopores have been developed based on simulations of diffusive ionic transport under salt gradients. The ionic diffusion current I satisfies a reciprocal relationship with the pore length L: in long nanopores, I correlates with a/L, where a is a constant. The constant a is determined by the cross-sectional areas of the diffusion paths for anions and cations inside nanopores, which can be described with a quadratic power of the diameter and with the superposition of a quadratic power and a first power of the diameter, respectively. By using effective concentration gradients instead of nominal ones, the deviation caused by concentration polarization can be effectively avoided in the prediction of the ionic diffusion current. With the developed equations for the effective concentration difference and the ionic diffusion current, the diffusion current across nanopores can be well predicted for nanopores longer than 100 nm and without overlapping electric double layers. Our results can provide a convenient way for the quantitative prediction of ionic diffusion currents under salt gradients.

    Describe, Explain, Plan and Select: Interactive Planning with Large Language Models Enables Open-World Multi-Task Agents

    We investigate the challenge of task planning for multi-task embodied agents in open-world environments. Two main difficulties are identified: 1) executing plans in an open-world environment (e.g., Minecraft) necessitates accurate and multi-step reasoning due to the long-term nature of tasks, and 2) as vanilla planners do not consider how easily the current agent can achieve a given sub-task when ordering parallel sub-goals within a complicated plan, the resulting plan could be inefficient or even infeasible. To this end, we propose "Describe, Explain, Plan and Select" (DEPS), an interactive planning approach based on Large Language Models (LLMs). DEPS facilitates better error correction on the initial LLM-generated plan by integrating description of the plan execution process and providing self-explanation of feedback when encountering failures during the extended planning phases. Furthermore, it includes a goal selector, which is a trainable module that ranks parallel candidate sub-goals based on the estimated steps of completion, consequently refining the initial plan. Our experiments mark the milestone of the first zero-shot multi-task agent that can robustly accomplish 70+ Minecraft tasks and nearly double the overall performances. Further testing reveals our method's general effectiveness in popularly adopted non-open-ended domains as well (i.e., ALFWorld and tabletop manipulation). The ablation and exploratory studies detail how our design beats the counterparts and provide a promising update on the ObtainDiamond grand challenge with our approach. The code is released at https://github.com/CraftJarvis/MC-Planner. Comment: NeurIPS 202
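The goal-selector idea described above, ranking parallel candidate sub-goals by their estimated steps to completion, can be sketched in a few lines. This is an illustration only. In the paper the selector is a trainable module, whereas here a hand-written lookup table stands in for the learned estimator, and the sub-goal names and step counts are hypothetical.

```python
from typing import Callable, Dict, List


def rank_sub_goals(candidates: List[str],
                   estimate_steps: Callable[[str], float]) -> List[str]:
    """Order candidate sub-goals from fewest to most estimated steps."""
    return sorted(candidates, key=estimate_steps)


# Hypothetical step estimates standing in for a trained predictor.
toy_step_estimates: Dict[str, float] = {
    "collect_wood": 40.0,
    "collect_stone": 120.0,
    "collect_sand": 75.0,
}


def toy_estimator(goal: str) -> float:
    """Look up the stand-in estimate; unknown goals rank last."""
    return toy_step_estimates.get(goal, float("inf"))


ranked = rank_sub_goals(
    ["collect_stone", "collect_wood", "collect_sand"], toy_estimator
)

if __name__ == "__main__":
    print(ranked)
```

A planner could then attempt the first entry of `ranked` and re-rank once the agent's state changes.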

    Factorized Contrastive Learning: Going Beyond Multi-view Redundancy

    In a wide range of multimodal tasks, contrastive learning has become a particularly appealing approach since it can successfully learn representations from abundant unlabeled data with only pairing information (e.g., image-caption or video-audio pairs). Underpinning these approaches is the assumption of multi-view redundancy - that shared information between modalities is necessary and sufficient for downstream tasks. However, in many real-world settings, task-relevant information is also contained in modality-unique regions: information that is only present in one modality but still relevant to the task. How can we learn self-supervised multimodal representations to capture both shared and unique information relevant to downstream tasks? This paper proposes FactorCL, a new multimodal representation learning method to go beyond multi-view redundancy. FactorCL is built from three new contributions: (1) factorizing task-relevant information into shared and unique representations, (2) capturing task-relevant information via maximizing MI lower bounds and removing task-irrelevant information via minimizing MI upper bounds, and (3) multimodal data augmentations to approximate task relevance without labels. On large-scale real-world datasets, FactorCL captures both shared and unique information and achieves state-of-the-art results on six benchmarks. Comment: Code available at: https://github.com/pliang279/FactorC
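As a concrete instance of "maximizing MI lower bounds", the sketch below computes the InfoNCE estimate, a widely used contrastive lower bound on mutual information. Whether FactorCL uses exactly this estimator is not stated in the abstract, so treat it as an assumption. The input is a plain score matrix in which `scores[i][j]` is a hypothetical critic value for pairing sample `i` of one modality with sample `j` of the other, and the diagonal holds the true pairs.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) over a list of floats."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def info_nce_lower_bound(scores):
    """InfoNCE estimate from an N x N score matrix, true pairs on the diagonal.

    The result lower-bounds mutual information (in nats) and cannot
    exceed log(N).
    """
    n = len(scores)
    total = 0.0
    for i, row in enumerate(scores):
        # Positive-pair score minus the log of the mean exp-score in the row.
        total += row[i] - (log_sum_exp(row) - math.log(n))
    return total / n


if __name__ == "__main__":
    uninformative = [[0.0] * 4 for _ in range(4)]
    sharp = [[50.0 if i == j else 0.0 for j in range(4)] for i in range(4)]
    print(info_nce_lower_bound(uninformative))  # critic carries no information
    print(info_nce_lower_bound(sharp), math.log(4))  # saturates near log(N)
```

Because the estimate is capped at log(N), larger batches are needed to certify larger amounts of shared information. The MI upper bounds mentioned in the abstract need a different estimator, which this sketch does not cover.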