160 research outputs found

    Computational Protein Design Using AND/OR Branch-and-Bound Search

    The computation of the global minimum energy conformation (GMEC) is an important and challenging problem in structure-based computational protein design. In this paper, we propose a new protein design algorithm based on AND/OR branch-and-bound (AOBB) search, a variant of the traditional branch-and-bound search, to solve this combinatorial optimization problem. By integrating a powerful heuristic function, AOBB fully exploits the graph structure of the underlying residue interaction network of a backbone template to significantly accelerate the design process. Tests on real protein data show that the new algorithm solves many problems that were previously unsolvable by traditional exact search algorithms, and on problems that traditional provable algorithms can solve, it provides speedups of several orders of magnitude while still guaranteeing to find the GMEC solution.
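    For illustration, a minimal branch-and-bound over rotamer assignments is sketched below. It prunes with an admissible lower bound on a pairwise energy function, the general idea behind provable GMEC search; it is a plain depth-first B&B and omits the AND/OR graph decomposition and heuristic that give the paper its speedup. All energy values are hypothetical.

```python
# Toy branch-and-bound for a pairwise rotamer energy model.
# E_self[i][r]: self energy of rotamer r at residue i (made-up values).
E_self = [[0.0, 1.2], [0.5, 0.1], [0.3, 0.9]]
# E_pair[(i, j)][(r, s)]: pair energy between rotamers (made-up, non-negative).
E_pair = {
    (0, 1): {(0, 0): 0.4, (0, 1): 0.0, (1, 0): 0.2, (1, 1): 0.6},
    (1, 2): {(0, 0): 0.1, (0, 1): 0.3, (1, 0): 0.0, (1, 1): 0.5},
}

def energy(conf):
    """Total energy of a complete rotamer assignment."""
    e = sum(E_self[i][r] for i, r in enumerate(conf))
    return e + sum(t[(conf[i], conf[j])] for (i, j), t in E_pair.items())

def lower_bound(partial, n):
    """Admissible bound: assigned terms plus best-case self terms for the rest.
    Pair terms touching unassigned residues are dropped (all are >= 0 here)."""
    e = sum(E_self[i][r] for i, r in enumerate(partial))
    e += sum(t[(partial[i], partial[j])] for (i, j), t in E_pair.items()
             if i < len(partial) and j < len(partial))
    return e + sum(min(E_self[i]) for i in range(len(partial), n))

def branch_and_bound(n, n_rot=2):
    best, best_e = None, float("inf")
    stack = [[]]
    while stack:
        partial = stack.pop()
        if lower_bound(partial, n) >= best_e:
            continue  # prune: this subtree cannot beat the incumbent
        if len(partial) == n:
            best, best_e = partial, energy(partial)
            continue
        stack.extend(partial + [r] for r in range(n_rot))
    return best, best_e

print(branch_and_bound(3))  # -> ([0, 1, 0], 0.4) for the toy tables above
```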

    FreePSI: an alignment-free approach to estimating exon-inclusion ratios without a reference transcriptome.

    Alternative splicing plays an important role in many cellular processes of eukaryotic organisms. The exon-inclusion ratio, also known as percent spliced in (PSI), is often regarded as one of the most effective measures of alternative splicing events. Existing methods for estimating exon-inclusion ratios at the genome scale all require a reference transcriptome. In this paper, we propose an alignment-free method, FreePSI, that performs genome-wide estimation of exon-inclusion ratios from RNA-Seq data without relying on the guidance of a reference transcriptome. It uses a novel probabilistic generative model based on k-mer profiles to quantify exon-inclusion ratios at the genome scale, together with an efficient expectation-maximization algorithm based on a divide-and-conquer strategy and an ultrafast conjugate gradient projection descent method to solve the model. We compare FreePSI with existing methods on simulated and real RNA-Seq data in terms of both accuracy and efficiency and show that it achieves very good performance even though a reference transcriptome is not provided. Our results suggest that FreePSI may have important applications in alternative splicing analysis for organisms that lack quality reference transcriptomes. FreePSI is implemented in C++ and freely available to the public on GitHub.
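    The core estimation step can be illustrated with a drastically simplified version of such a generative model: two isoforms (exon included vs. skipped), k-mers that are unique to one isoform or shared by both, and an EM loop for the inclusion ratio psi. The counts and compatibility table below are hypothetical, and FreePSI's actual model (genome-scale, with its divide-and-conquer solver) is far richer.

```python
# counts[k]: observed count of k-mer k (hypothetical values);
# compat[k]: which isoforms could emit it (0 = inclusion, 1 = exclusion).
counts = {"AAC": 80, "CGT": 30, "TTG": 200}
compat = {"AAC": {0}, "CGT": {1}, "TTG": {0, 1}}  # "TTG" is shared

def em_psi(counts, compat, n_iter=100):
    psi = 0.5  # initial guess for the exon-inclusion ratio
    for _ in range(n_iter):
        n_inc = n_exc = 0.0
        for kmer, c in counts.items():
            if compat[kmer] == {0}:
                n_inc += c
            elif compat[kmer] == {1}:
                n_exc += c
            else:
                n_inc += c * psi        # E-step: split shared k-mers
                n_exc += c * (1 - psi)  # by current responsibilities
        psi = n_inc / (n_inc + n_exc)   # M-step: re-estimate psi
    return psi

print(round(em_psi(counts, compat), 3))  # converges to about 0.727
```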

    Centerline Extraction for Image Segmentation Using Gradient and Direction Vector Flow Active Contours

    In this paper, we propose a fast centerline extraction method for gradient and direction vector flow (GDVF) active contours. GDVF is a recently reported active contour model that significantly improves image segmentation performance, especially for complex object shapes, by seamlessly integrating gradient vector flow with prior directional information. Since the prior directional information is provided by manual line drawing, the model can be inconvenient for inexperienced users, who may have difficulty finding the best places to draw the directional lines for the best segmentation performance. This paper describes a method to overcome this problem by automatically extracting centerlines that guide users in providing the right directional information. Experimental results on synthetic and real images demonstrate the feasibility of the proposed method.
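    As a rough illustration, centerlines of a binary object can be obtained by morphological skeletonization, for example with scikit-image as sketched below; the paper's own extraction method and its coupling to the GDVF model are not reproduced here, and the mask is a made-up shape.

```python
import numpy as np
from skimage.morphology import skeletonize

# Hypothetical binary object: an L-shaped thick bar.
mask = np.zeros((40, 60), dtype=bool)
mask[15:25, 5:50] = True   # horizontal arm
mask[15:35, 40:50] = True  # vertical arm

centerline = skeletonize(mask)    # one-pixel-wide medial curve
coords = np.argwhere(centerline)  # (row, col) points along the centerline
print(len(coords), "centerline pixels")
```

    Points like these could then seed the directional lines that GDVF otherwise asks the user to draw by hand.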

    Integrated Governance of Scenarized Space and Community — Reform of Beijing Qianggen Community Service Station and Enlightenment

    Community governance is significant for grassroots governance in China, and micro-governance and micro-reform starting from the community service station are meaningful measures for improving it. Focusing on the reform of community service stations in Beijing, this paper describes in detail, against the background of the service station reform, the history, content and characteristics of the comprehensive-setting reform of Qianggen Community on G Subdistrict of Xicheng District, Beijing, and conducts an in-depth analysis based on “The Theory of Scenes” and “The Theory of Governance”. The author holds that community service stations, having taken on new roles, created new scenarios and shaped new mechanisms after transformation and upgrading, become governance centers that connect multiple parties, respond better to residents' needs and improve the effectiveness of community governance. The reform practice is committed to generating scenarized social space and promoting the emergence of an integrated governance pattern. The author further reflects on issues related to grassroots governance and puts forward several suggestions for deepening the reform.

    Universal Sleep Decoder: Aligning awake and sleep neural representation across subjects

    Decoding memory content from brain activity during sleep has long been a goal in neuroscience. While spontaneous reactivation of memories during sleep in rodents is known to support memory consolidation and offline learning, capturing memory replay in humans is challenging due to the absence of well-annotated sleep datasets and the substantial differences in neural patterns between wakefulness and sleep. To address these challenges, we designed a novel cognitive neuroscience experiment and collected a comprehensive, well-annotated electroencephalography (EEG) dataset from 52 subjects during both wakefulness and sleep. Leveraging this benchmark dataset, we developed the Universal Sleep Decoder (USD) to align neural representations between wakefulness and sleep across subjects. Our model achieves up to 16.6% top-1 zero-shot accuracy on unseen subjects, comparable to the decoding performance obtained using individual sleep data. Furthermore, fine-tuning USD on test subjects enhances decoding to 25.9% top-1 accuracy, a substantial improvement over the chance baseline of 6.7%. Model comparison and ablation analyses reveal that our design choices, including (i) an additional contrastive objective to integrate awake and sleep neural signals and (ii) a pretrain-finetune paradigm to incorporate different subjects, contribute significantly to this performance. Collectively, our findings and methodologies represent a significant advancement in the field of sleep decoding.
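    The contrastive objective in (i) can be sketched as a standard InfoNCE-style loss over paired awake and sleep embeddings, assuming some encoder has already produced the two batches; the actual USD architecture and training recipe are not reproduced here.

```python
import torch
import torch.nn.functional as F

def align_loss(z_awake, z_sleep, temperature=0.1):
    """z_awake, z_sleep: (batch, dim) embeddings of the same memory items."""
    z_awake = F.normalize(z_awake, dim=1)
    z_sleep = F.normalize(z_sleep, dim=1)
    logits = z_awake @ z_sleep.t() / temperature  # cosine similarities
    targets = torch.arange(z_awake.size(0))       # matched pairs on the diagonal
    # Symmetric cross-entropy: awake->sleep and sleep->awake retrieval.
    return (F.cross_entropy(logits, targets)
            + F.cross_entropy(logits.t(), targets)) / 2

loss = align_loss(torch.randn(8, 128), torch.randn(8, 128))
print(loss.item())
```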

    GUDN: A novel guide network with label reinforcement strategy for extreme multi-label text classification

    In natural language processing, extreme multi-label text classification (XMTC) is an emerging but essential task: recalling the most relevant labels for a text from an extremely large label set. Large-scale pre-trained models have brought a new trend to this problem, but although they have achieved significant results, effective fine-tuning methods remain understudied. Likewise, although label semantics have been introduced in XMTC, the vast semantic gap between texts and labels has yet to gain enough attention. This paper builds a new guide network (GUDN) that helps fine-tune the pre-trained model to guide the subsequent classification. Furthermore, GUDN uses raw label semantics combined with a helpful label reinforcement strategy to effectively explore the latent space between texts and labels, narrowing the semantic gap and further improving prediction accuracy. Experimental results demonstrate that GUDN outperforms state-of-the-art methods on Eurlex-4k and achieves competitive results on other popular datasets. In an additional experiment, we investigate the influence of input length on the accuracy of Transformer-based models. Our source code is released at https://t.hk.uy/aFSH.
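    The basic use of raw label semantics can be sketched as text-label similarity recall, with both sides embedded in a shared space; the random tensors below are stand-ins for encoder outputs, and GUDN's guide network and label reinforcement strategy are not reproduced.

```python
import torch
import torch.nn.functional as F

n_labels, dim, k = 4000, 256, 5
label_emb = F.normalize(torch.randn(n_labels, dim), dim=1)  # encoded label texts
text_emb = F.normalize(torch.randn(dim), dim=0)             # encoded document

scores = label_emb @ text_emb         # cosine similarity to every label
topk = torch.topk(scores, k).indices  # recall the k most relevant labels
print(topk.tolist())
```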

    Dataset Quantization

    State-of-the-art deep neural networks are trained on large amounts (millions or even billions) of data samples. The expensive computation and memory costs make it difficult to train them on limited hardware resources, especially for the recently popular large language models (LLMs) and computer vision (CV) models. Dataset distillation methods have thus been developed, aiming to reduce the number of training samples by synthesizing small-scale datasets via gradient matching. However, because the gradient calculation is coupled with a specific network architecture, the synthesized dataset is biased and performs poorly when used for training unseen architectures. To address these limitations, we present dataset quantization (DQ), a new framework to compress large-scale datasets into small subsets that can be used for training any neural network architecture. Extensive experiments demonstrate that DQ generates condensed small datasets for training unseen network architectures with state-of-the-art compression ratios for lossless model training. To the best of our knowledge, DQ is the first method that can successfully distill large-scale datasets such as ImageNet-1k with a state-of-the-art compression ratio. Notably, with 60% of the data from ImageNet and 20% of the data from Alpaca's instruction-tuning set, models can be trained with negligible or no performance drop on both vision tasks (including classification, semantic segmentation, and object detection) and language tasks (including instruction-tuning tasks such as BBH and DROP).
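    As a point of contrast with gradient matching, a generic architecture-agnostic subset selection can be as simple as k-center greedy over extracted features, sketched below; this is not the paper's DQ pipeline (which partitions the dataset into non-overlapping bins before sampling), and the features are random stand-ins.

```python
import numpy as np

def k_center_greedy(features, budget, seed=0):
    """Greedily pick `budget` samples that cover feature space."""
    rng = np.random.default_rng(seed)
    chosen = [int(rng.integers(len(features)))]
    # Distance from every point to its nearest chosen center so far.
    dists = np.linalg.norm(features - features[chosen[0]], axis=1)
    for _ in range(budget - 1):
        nxt = int(np.argmax(dists))  # farthest point joins the subset
        chosen.append(nxt)
        dists = np.minimum(dists, np.linalg.norm(features - features[nxt], axis=1))
    return chosen

feats = np.random.randn(1000, 64)        # stand-in for image features
print(len(k_center_greedy(feats, 100)))  # -> 100 selected sample indices
```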

    The effect of warming on grassland evapotranspiration partitioning using laser-based isotope monitoring techniques

    The proportion of transpiration (T) in total evapotranspiration (ET) is an important parameter that provides insight into the degree of biological influence on the hydrological cycle. Studies addressing the effects of climatic warming on the ecosystem's total water balance are scarce, and measured warming effects on the T/ET ratio in field experiments have not previously been reported. In this study, we quantified T/ET ratios under ambient and warming treatments in a grassland ecosystem using a stable isotope approach. The measurements were made at a long-term grassland warming site in Oklahoma during the May–June peak growing season of 2011. Chamber-based methods were used to estimate the δ2H isotopic compositions of evaporation (δE), transpiration (δT) and aggregated evapotranspiration (δET): a modified commercial conifer leaf chamber for δT, a modified commercial soil chamber for δE, and a custom-built chamber for δET. The δE, δET and δT were quantified using both the Keeling plot approach and a mass balance method, with the Craig–Gordon model also used to calculate δE. Multiple methods showed no significant difference between control and warming plots for either δET or δT. Though the chamber-based estimates and the Craig–Gordon results diverged by about 12‰, all methods showed that δE was more depleted in the warming plots. This depletion in δE indicates that the evaporation flux, as a fraction of the total water flux, must have decreased for δET to remain constant, which was confirmed by field observations. The T/ET ratio was 0.65 or 0.77 in the control treatment and 0.83 or 0.86 in the warming treatment, based on the chamber method and the Craig–Gordon approach, respectively. Sensitivity analysis of the Craig–Gordon model demonstrates that the warming-induced decrease in the isotopic composition of soil liquid water is the major factor responsible for the observed δE depletion, while the temperature-dependent equilibrium effects are minor. Multiple lines of evidence indicate that the increased T/ET ratio under warming is caused mainly by reduced evaporation.
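    The partitioning itself rests on a two-source isotope mass balance: with end members δT and δE and the mixture δET, δET = f·δT + (1 − f)·δE, so the transpiration fraction is f = T/ET = (δET − δE)/(δT − δE). A one-line check with illustrative (not the paper's) δ2H values:

```python
def t_over_et(d_et, d_t, d_e):
    """Transpiration fraction from the two-member mixing model."""
    return (d_et - d_e) / (d_t - d_e)

# Hypothetical delta-2H values in permil.
print(round(t_over_et(d_et=-40.0, d_t=-25.0, d_e=-90.0), 2))  # -> 0.77
```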

    A simple implementation of PML for second-order elastic wave equations

    When modeling time-domain elastic wave propagation in an unbounded space, the standard perfectly matched layer (PML) is straightforward for the first-order partial differential equations (PDEs); by contrast, for the second-order PDE form the PML requires extensive reconstruction of the governing equations, even though that form is preferable because it consumes much less memory and time. It is therefore worthwhile to explore a simple implementation of PML for the second-order system. In this work, we systematically extend the first-order Nearly PML (NPML) technique to second-order systems, implemented with spectral element and finite-difference time-domain algorithms. The approach has the following advantages: simplicity of implementation, since the second-order PDE-based governing equations are kept exactly the same; and computational efficiency, since absorption is introduced through a set of auxiliary ordinary differential equations (ODEs). Mathematically, this PML technique hybridizes the second-order PDEs with first-order ODEs and attenuates outgoing waves locally, thus efficiently avoiding global convolutions in either space or time. Numerical experiments demonstrate that the NPML for the second-order PDE has excellent absorbing performance for elastic, anelastic and anisotropic media in terms of absorption accuracy, implementation complexity and computational efficiency.
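    The convenience of staying in second-order form can be seen even with a much cruder absorber than the NPML: in the 1-D sketch below, the interior update of the second-order wave equation is untouched and absorption enters only through a local damping profile σ(x) in boundary sponge layers. This is a simple damping sponge, not the paper's NPML with auxiliary ODEs, and all parameters are illustrative.

```python
import numpy as np

nx, dx, c = 400, 1.0, 1.0
dt = 0.5 * dx / c  # CFL-stable time step
x = np.arange(nx) * dx

# Damping profile: zero in the interior, quadratic ramp in 40-cell layers.
w, strength = 40, 0.5
ramp = np.linspace(0.0, 1.0, w) ** 2
sigma = np.zeros(nx)
sigma[:w], sigma[-w:] = strength * ramp[::-1], strength * ramp

u = np.exp(-((x - nx * dx / 2) ** 2) / 50.0)  # Gaussian initial pulse
u_old = u.copy()
for _ in range(1500):
    lap = np.zeros(nx)
    lap[1:-1] = (u[2:] - 2 * u[1:-1] + u[:-2]) / dx**2
    # Standard second-order update plus a local damping term sigma * u_t.
    u_new = 2 * u - u_old + dt**2 * c**2 * lap - sigma * dt * (u - u_old)
    u_old, u = u, u_new

print(float(np.abs(u).max()))  # residual amplitude far below the unit pulse
```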