Complete Chloroplast Genome Sequence of a Major Invasive Species, Crofton Weed (Ageratina adenophora)
Crofton weed (Ageratina adenophora) is one of the most hazardous invasive plant species, causing serious economic losses and environmental damage worldwide. However, the sequence resources and genome information for A. adenophora are rather limited, making phylogenetic identification and evolutionary studies very difficult. Here, we report the complete sequence of the A. adenophora chloroplast (cp) genome based on Illumina sequencing. The A. adenophora cp genome is 150,689 bp in length, including a small single-copy (SSC) region of 18,358 bp and a large single-copy (LSC) region of 84,815 bp separated by a pair of inverted repeats (IRs) of 23,755 bp. The genome contains 130 unique genes and 18 duplicated in the IR regions, with gene content and organization similar to those of other Asteraceae cp genomes. Comparative analysis identified five DNA regions (ndhD-ccsA, psbI-trnS, ndhF-ycf1, ndhI-ndhG and atpA-trnR) with more than 2% parsimony-informative characters, which may be informative markers for barcoding and phylogenetic analysis. Repeat structure, codon usage and contraction of the IR were also investigated to reveal the pattern of evolution. Phylogenetic analysis demonstrated a sister relationship between A. adenophora and Guizotia abyssinica and supported the monophyly of the Asterales. We have assembled and analyzed the chloroplast genome of A. adenophora in this study, which was the first sequenced plastome in the Eupatorieae tribe. The complete chloroplast genome information is useful for plant phylogenetic and evolutionary studies within this invasive species and also within the Asteraceae family.
Jointly Multiple Hash Learning
Hashing can compress heterogeneous high-dimensional data into compact binary codes while preserving similarity to facilitate efficient retrieval and storage, and it has therefore recently received much attention from information retrieval researchers. Most existing hashing methods first predefine a fixed length (e.g., 32, 64, or 128 bits) for the hash codes and then learn codes of that length. However, one sample can be represented by hash codes of different lengths, and there must be some associations among these codes because they represent the same sample. Harnessing these relationships should therefore boost the performance of hashing methods. Inspired by this possibility, in this study we propose a new model, jointly multiple hash learning (JMH), which can learn hash codes with multiple lengths simultaneously. In the proposed JMH method, three types of information are used for hash learning: hash codes with different lengths, the original features of the samples, and the labels. In contrast to existing hashing methods, JMH can learn hash codes with different lengths in one step. Users can select appropriate hash codes for their retrieval tasks according to their requirements in terms of accuracy and complexity. To the best of our knowledge, JMH is one of the first attempts to learn multi-length hash codes simultaneously. In addition, in the proposed model, discrete and closed-form solutions for the variables can be obtained by cyclic coordinate descent, thereby making the proposed model much faster during training. Extensive experiments were performed on three benchmark datasets, and the results demonstrated the superior performance of the proposed method.
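To make the idea of binary codes at several lengths concrete, the following minimal sketch produces 32-, 64-, and 128-bit codes for one sample and compares codes by Hamming distance. It uses random-hyperplane hashing as a stand-in, not the learned JMH model, and it hashes each length independently rather than jointly; the function names, seeds, and code lengths are assumptions for illustration only.

```python
import random

def random_hyperplanes(dim, n_bits, seed=0):
    """Draw n_bits random Gaussian hyperplanes in dim dimensions (stand-in for learned projections)."""
    rng = random.Random(seed)
    return [[rng.gauss(0.0, 1.0) for _ in range(dim)] for _ in range(n_bits)]

def hash_code(vector, planes):
    """One bit per hyperplane: 1 if the vector lies on its non-negative side."""
    return [1 if sum(w * x for w, x in zip(p, vector)) >= 0 else 0 for p in planes]

def hamming(a, b):
    """Number of positions at which two equal-length codes differ."""
    return sum(x != y for x, y in zip(a, b))

def multi_length_codes(vector, dim, lengths=(32, 64, 128)):
    """Return one binary code per requested length for the same sample."""
    return {n: hash_code(vector, random_hyperplanes(dim, n, seed=n)) for n in lengths}

codes = multi_length_codes([0.5, -1.0, 2.0], dim=3)
```

A caller would pick `codes[32]` for cheaper, coarser retrieval or `codes[128]` for finer ranking, which mirrors the accuracy and complexity trade-off described in the abstract.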
Error Modeling and Sensitivity Analysis of a Five-Axis Machine Tool
Geometric error modeling and sensitivity analysis are carried out in this paper to support the precision design of machine tools. Screw theory and rigid body kinematics are used to establish the error model of an RRTTT-type five-axis machine tool, which enables the source errors affecting the compensable and uncompensable pose accuracy of the machine tool to be explicitly separated, thereby providing designers and/or field engineers with an informative guideline for improving accuracy by suitable measures, that is, component tolerancing in design, manufacturing, and assembly processes, and error compensation. A sensitivity analysis method is proposed, and the sensitivities of compensable and uncompensable pose accuracies are analyzed. The analysis results will be used for the precision design of the machine tool.
Rapid evaluation of machine tools with position-dependent milling stability based on response surface model
Milling stability is one of the important evaluation criteria for the dynamic characteristics of machine tools, and it is of great importance for machine tool design and manufacturing. The milling stability of machine tools generally varies with the position combinations of moving parts. Traditional milling stability analysis of machine tools is based on a few specific positions in the workspace, so the results are not comprehensive. Furthermore, analyzing multiple positions is very time-consuming in both operation and calculation. A new method to rapidly evaluate the stability of machine tools with position dependence is developed in this article. In this method, the key position combinations of moving parts are taken as samples, and the dynamic characteristics of the machine tool at each sample are calculated with the SAMCEF finite element simulation software. Then the minimum critical axial cutting depth of each sample is obtained. The relationship between position and the minimum critical axial cutting depth at any position in the whole workspace can be obtained through the established response surface model. The precision of the response surface model is evaluated, and the model can be used to rapidly evaluate the milling stability of machine tools with position dependence. With a precision horizontal machining center with box-in-box structure as an example, the value of the minimum critical axial cutting depth at any position is shown. This method of rapid evaluation of machine tools with position-dependent stability avoids complicated theoretical calculation, so it can be easily adopted by engineers and technicians in the design phase of machine tools.
UFold: fast and accurate RNA secondary structure prediction with deep learning
For many RNA molecules, the secondary structure is essential for the correct function of the RNA. Predicting RNA secondary structure from nucleotide sequences is a long-standing problem in genomics, but the prediction performance has reached a plateau over time. Traditional RNA secondary structure prediction algorithms are primarily based on thermodynamic models through free energy minimization, which imposes strong prior assumptions and is slow to run. Here, we propose a deep learning-based method, called UFold, for RNA secondary structure prediction, trained directly on annotated data and base-pairing rules. UFold proposes a novel image-like representation of RNA sequences, which can be efficiently processed by Fully Convolutional Networks (FCNs). We benchmark the performance of UFold on both within- and cross-family RNA datasets. It significantly outperforms previous methods on within-family datasets, while achieving performance similar to that of traditional methods when trained and tested on distinct RNA families. UFold is also able to predict pseudoknots accurately. Its prediction is fast, with an inference time of about 160 ms per sequence up to 1500 bp in length. An online web server running UFold is available at https://ufold.ics.uci.edu. Code is available at https://github.com/uci-cbcl/UFold.
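As a rough illustration of what an image-like representation of an RNA sequence could look like, the sketch below turns a sequence of length L into a stack of L x L binary maps, one per ordered pair of bases. The abstract does not spell out the encoding, so the 16-channel layout and the names `one_hot` and `pairwise_channels` are assumptions for illustration, not UFold's actual code.

```python
BASES = "ACGU"

def one_hot(seq):
    """Return one 4-element one-hot row per nucleotide (all zeros for unknown symbols)."""
    return [[1 if b == base else 0 for base in BASES] for b in seq.upper()]

def pairwise_channels(seq):
    """Build a 16 x L x L tensor as nested lists.

    Channel 4*a + b is 1 at (i, j) when position i holds base BASES[a]
    and position j holds base BASES[b], and 0 otherwise.
    """
    x = one_hot(seq)
    n = len(seq)
    channels = []
    for a in range(4):
        for b in range(4):
            channels.append([[x[i][a] * x[j][b] for j in range(n)] for i in range(n)])
    return channels

image = pairwise_channels("GGAC")
```

Each position pair (i, j) activates exactly one channel, so a convolutional network sees every candidate base pairing as a pixel in a multi-channel image.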
Integrated analysis of multimodal single-cell data with structural similarity
Multimodal single-cell sequencing technologies provide unprecedented information on cellular heterogeneity from multiple layers of genomic readouts. However, joint analysis of two modalities without properly handling the noise often leads to overfitting of one modality by the other and worse clustering results than vanilla single-modality analysis. How to efficiently utilize the extra information from single-cell multi-omics to delineate cell states and identify meaningful signals remains a significant computational challenge. In this work, we propose a deep learning framework, named SAILERX, for efficient, robust, and flexible analysis of multimodal single-cell data. SAILERX consists of a variational autoencoder with invariant representation learning to correct technical noise from the sequencing process, and a multimodal data alignment mechanism to integrate information from different modalities. Instead of performing hard alignment by projecting both modalities to a shared latent space, SAILERX encourages the local structures of the two modalities, measured by pairwise similarities, to be similar. This strategy is more robust against overfitting to noise, which facilitates various downstream analyses such as clustering, imputation, and marker gene detection. Furthermore, the invariant representation learning part enables SAILERX to perform integrative analysis on both multi- and single-modal datasets, making it an applicable and scalable tool for more general scenarios.
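The soft alignment idea can be sketched as a loss that compares the pairwise-similarity structure of two embeddings of the same cells, instead of forcing the embeddings themselves to coincide. The abstract does not specify the similarity measure or the loss, so the use of cosine similarity, the mean squared difference, and the function names below are assumptions for illustration rather than the SAILERX implementation.

```python
import math

def cosine_similarity_matrix(embeddings):
    """Pairwise cosine similarities between rows (one row per cell)."""
    # A zero-length row gets norm 1.0 to avoid division by zero.
    norms = [math.sqrt(sum(v * v for v in row)) or 1.0 for row in embeddings]
    n = len(embeddings)
    return [
        [
            sum(a * b for a, b in zip(embeddings[i], embeddings[j])) / (norms[i] * norms[j])
            for j in range(n)
        ]
        for i in range(n)
    ]

def structure_alignment_loss(emb_a, emb_b):
    """Mean squared difference between the two modalities' similarity matrices.

    emb_a and emb_b must list the same cells in the same order,
    but may have different embedding dimensions.
    """
    sim_a = cosine_similarity_matrix(emb_a)
    sim_b = cosine_similarity_matrix(emb_b)
    n = len(sim_a)
    return sum((sim_a[i][j] - sim_b[i][j]) ** 2 for i in range(n) for j in range(n)) / (n * n)

# Hypothetical embeddings of three cells; the second is a rotated copy of the first.
modality_a = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
modality_b = [[0.0, 1.0], [-1.0, 0.0], [-1.0, 1.0]]
loss = structure_alignment_loss(modality_a, modality_b)
```

Because the loss depends only on similarities between cells, two embeddings that differ by a rotation incur essentially no penalty, whereas a hard alignment in a shared latent space would penalize them.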
SAILER: scalable and accurate invariant representation learning for single-cell ATAC-seq processing and integration.
Motivation: Single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq) provides new opportunities to dissect epigenomic heterogeneity and elucidate transcriptional regulatory mechanisms. However, computational modeling of scATAC-seq data is challenging due to its high dimension, extreme sparsity, complex dependencies and high sensitivity to confounding factors from various sources. Results: Here, we propose a new deep generative model framework, named SAILER, for analyzing scATAC-seq data. SAILER aims to learn a low-dimensional nonlinear latent representation of each cell that defines its intrinsic chromatin state, invariant to extrinsic confounding factors like read depth and batch effects. SAILER adopts the conventional encoder-decoder framework to learn the latent representation but imposes additional constraints to ensure the independence of the learned representations from the confounding factors. Experimental results on both simulated and real scATAC-seq datasets demonstrate that SAILER learns better and biologically more meaningful representations of cells than other methods. Its noise-free cell embeddings bring in significant benefits in downstream analyses: clustering and imputation based on SAILER result in 6.9% and 18.5% improvements over existing methods, respectively. Moreover, because no matrix factorization is involved, SAILER can easily scale to process millions of cells. We implemented SAILER into a software package, freely available to all for large-scale scATAC-seq data analysis. Availability and implementation: The software is publicly available at https://github.com/uci-cbcl/SAILER. Supplementary information: Supplementary data are available at Bioinformatics online.
Adiponectin and adiponectin receptors in common carp (Cyprinus carpio): Tissue distribution and their expressions in response to high-carbohydrate and high-lipid diets
To investigate the roles of adiponectin in glucose and lipid metabolism in common carp, this study first determined the tissue distribution of adiponectin and its receptor genes and then examined the effects of high-carbohydrate (45%) and high-lipid (11%) diets on their expression in a feeding trial. The results showed that adipoqa and adipoqb mRNA levels were highest in the red muscle, followed by the heart and white muscle. Adiponectin receptor genes were widely expressed in all tested tissues. Two subtypes of the adiponectin receptor 1 gene (adipor1a and adipor1b) were expressed at the highest level in the brain, while adipor2 mRNA was highly expressed in the heart and red muscle. The high-carbohydrate diet significantly up-regulated adipoqa and adipoqb mRNA levels in the heart of common carp, while the high-lipid diet significantly promoted adipoqa mRNA expression in the red muscle and heart, compared to the control diet. For adiponectin receptor gene expression, the high-carbohydrate diet up-regulated the adipor2 mRNA level in the hepatopancreas, while the high-lipid diet significantly promoted adipor1a, adipor1b and adipor2 mRNA expression in the red muscle and hepatopancreas. The results showed that both high-carbohydrate and high-lipid diets could induce adiponectin gene expression in common carp, and that AdipoR expression was more likely to be increased by the high-lipid diet than by the high-carbohydrate diet. This study suggests that adiponectin and its receptors are involved in the metabolic regulation of fish under high dietary carbohydrate and lipid levels.