88 research outputs found

    On the clustering property of the random intersection graphs

    A random intersection graph $\mathcal{G}_{V,W,p}$ is induced from a random bipartite graph $\mathcal{G}^{*}_{V,W,p}$ with vertex classes $V$ and $W$, in which each pair $v \in V$, $w \in W$ is joined by an edge with probability $p$. Two vertices in $V$ are considered connected if both are connected to some common vertex in $W$. This article investigates the clustering properties of the random intersection graph in full. Let the numbers of vertices be $N = |V|$ and $M = |W|$, with $M = N^{\alpha}$ and $p = N^{-\beta}$, where $\alpha > 0$ and $\beta > 0$; we derive exact expressions for the clustering coefficient $C_{v}$ of a vertex $v$ in $\mathcal{G}_{V,W,p}$. The results show that if $\alpha < 2\beta$ and $\alpha \neq \beta$, $C_{v}$ decreases as the graph size increases; if $\alpha = \beta$ or $\alpha \geq 2\beta$, the graph has a constant clustering coefficient; in addition, if $\alpha > 2\beta$, the graph is almost completely connected. We thus illustrate the phase transition of the clustering property in random intersection graphs and give the condition under which $\mathcal{G}_{V,W,p}$ is a highly clustered graph.
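The construction in the abstract above can be sketched in a few lines of Python. This is only an illustrative simulation, not the authors' derivation: the function names (`random_intersection_graph`, `clustering_coefficient`, `mean_clustering`) are made up for this sketch, and the clustering coefficient is assumed to be the usual local one (the fraction of pairs of neighbours of $v$ that are themselves adjacent).

```python
import random
from itertools import combinations

def random_intersection_graph(n, m, p, seed=None):
    """Sample a random intersection graph on V = {0, ..., n-1}.

    Returns a dict mapping each vertex of V to the set of its neighbours.
    """
    rng = random.Random(seed)
    # Bipartite step: each v in V is joined to each w in W independently
    # with probability p.
    attrs = [{w for w in range(m) if rng.random() < p} for _ in range(n)]
    adj = {v: set() for v in range(n)}
    # Projection step: u and v are adjacent iff they share an element of W.
    for u, v in combinations(range(n), 2):
        if attrs[u] & attrs[v]:
            adj[u].add(v)
            adj[v].add(u)
    return adj

def clustering_coefficient(adj, v):
    """Local clustering coefficient of v (0.0 if v has fewer than 2 neighbours)."""
    nbrs = adj[v]
    k = len(nbrs)
    if k < 2:
        return 0.0
    links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
    return links / (k * (k - 1) / 2)

def mean_clustering(n, alpha, beta, seed=0):
    """Average C_v for M = N^alpha (rounded) and p = N^(-beta)."""
    m = max(1, round(n ** alpha))
    p = n ** (-beta)
    adj = random_intersection_graph(n, m, p, seed)
    return sum(clustering_coefficient(adj, v) for v in adj) / n
```

Calling `mean_clustering` for increasing `n` at fixed `alpha` and `beta` gives a rough empirical view of how the average coefficient changes with graph size; small simulations like this cannot confirm the asymptotic regimes stated in the abstract.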

    Identifications of conserved 7-mers in 3'-UTRs and microRNAs in Drosophila

    Abstract. Background: MicroRNAs (miRNAs) are a class of endogenous regulatory small RNAs which play an important role in posttranscriptional regulation by targeting mRNAs for cleavage or translational repression. The base-pairing between the 5'-end of the miRNA and the 3'-UTR of the target mRNA is essential for miRNA:mRNA recognition. Recent studies show that many seed matches in 3'-UTRs, which are fully complementary to miRNA 5'-ends, are highly conserved. Based on these features, a two-stage strategy can be implemented to achieve the de novo identification of miRNAs by requiring complete base-pairing between the 5'-end of miRNA candidates and the potential seed matches in 3'-UTRs. Results: We present a new method, which combines multiple pairwise conservation information, to identify frequently occurring and conserved 7-mers in 3'-UTRs. A pairwise conservation score (PCS) was introduced to describe the conservation of all 7-mers in 3'-UTRs between any two Drosophila species. Using PCSs computed from 6 pairs of flies, we developed a support vector machine (SVM) classifier ensemble, named Cons-SVM, and identified 689 conserved 7-mers, including 63 seed matches covering 32 out of 38 known miRNA families in the reference dataset. In the second stage, we searched for 90 nt conserved stem-loop regions containing the complementary sequences to the identified 7-mers and used previously published miRNA prediction software to analyze these stem-loops. We predicted 47 miRNA candidates in the genome-wide screen. Conclusion: Cons-SVM takes advantage of the independent evolutionary information from the 6 pairs of flies and shows high sensitivity in identifying seed matches in 3'-UTRs. By combining the multiple pairwise conservation information with a machine learning approach, we finally identified 47 miRNA candidates in D. melanogaster.

    EsATAC: an easy-to-use systematic pipeline for ATAC-seq data analysis

    Summary: ATAC-seq is rapidly emerging as one of the major experimental approaches to probe chromatin accessibility genome-wide. Here, we present ‘esATAC’, a highly integrated, easy-to-use R/Bioconductor package for systematic ATAC-seq data analysis. It covers the essential steps of a full analysis procedure, including raw data processing, quality control and downstream statistical analysis such as peak calling, enrichment analysis and transcription factor footprinting. esATAC supports one-command-line execution of preset pipelines and provides flexible interfaces for building customized pipelines. Availability and implementation: The esATAC package is open source under the GPL-3.0 license. It is implemented in R and C++. Source code and binaries for Linux, Mac OS X and Windows are available through Bioconductor (https://www.bioconductor.org/packages/release/bioc/html/esATAC.html). Supplementary information: Supplementary data are available at Bioinformatics online. Document type: Article

    Functional importance of different patterns of correlation between adjacent cassette exons in human and mouse

    Abstract. Background: Alternative splicing expands transcriptome diversity and plays an important role in the regulation of gene expression. Previous studies focused on the regulation of a single cassette exon, but recent experiments indicate that multiple cassette exons within a gene may interact with each other. This interaction can increase the potential to generate various transcripts and adds an extra layer of complexity to gene regulation. Several cases of exon interaction have been discovered. However, the extent to which cassette exons coordinate with each other remains unknown. Results: Based on EST data, we employed a metric of correlation coefficients to describe the interaction between two adjacent cassette exons and then categorized these exon pairs into three different groups by their interaction (correlation) patterns. Sequence analysis demonstrates that strongly-correlated groups are more conserved and contain a higher proportion of pairs with reading frame preservation in a combinatorial manner. Multiple genome comparison further indicates that different groups of correlated pairs have different evolutionary courses: (1) the vast majority of positively-correlated pairs are old, (2) most of the weakly-correlated pairs are relatively young, and (3) negatively-correlated pairs are a mixture of old and young events. Conclusion: We performed a large-scale analysis of interactions between adjacent cassette exons. Compared with weakly-correlated pairs, the strongly-correlated pairs, including both the positively and negatively correlated ones, show more evidence that they are under delicate splicing control and tend to be functionally important. Additionally, the positively-correlated pairs bear a strong resemblance to constitutive exons, which suggests that they may have evolved from ancient constitutive exons, while negatively and weakly correlated pairs are more likely to contain newly emerging exons.
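The abstract above does not define its correlation metric, so the following is only one plausible reading, sketched in Python. It assumes each transcript (for example an EST) that covers both exons is reduced to a pair of 0/1 inclusion indicators, uses the ordinary Pearson coefficient on those indicators, and splits pairs into three groups with a hypothetical threshold of 0.5. The function names and the threshold are assumptions for illustration, not values from the study.

```python
import math

def inclusion_correlation(observations):
    """Pearson correlation between inclusion indicators of two adjacent exons.

    observations: list of (a, b) tuples, one per transcript covering both
    exons, where a and b are 1 if the exon is included and 0 if skipped.
    Returns 0.0 when either exon shows no variation.
    """
    n = len(observations)
    mean_a = sum(a for a, _ in observations) / n
    mean_b = sum(b for _, b in observations) / n
    cov = sum((a - mean_a) * (b - mean_b) for a, b in observations)
    var_a = sum((a - mean_a) ** 2 for a, _ in observations)
    var_b = sum((b - mean_b) ** 2 for _, b in observations)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)

def classify_pair(r, threshold=0.5):
    """Assign an exon pair to one of three groups (threshold is hypothetical)."""
    if r >= threshold:
        return "positively correlated"
    if r <= -threshold:
        return "negatively correlated"
    return "weakly correlated"
```

Under these assumptions, two exons that are always included or skipped together score 1, and two mutually exclusive exons score -1.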

    Quantum oscillations revealing topological band in kagome metal ScV6Sn6

    Compounds with a kagome lattice structure are known to exhibit Dirac cones, flat bands, and van Hove singularities, which host numerous versatile quantum phenomena. Inspired by these intriguing properties, we investigate the temperature- and magnetic-field-dependent electrical transport of ScV6Sn6, a nonmagnetic charge density wave (CDW) compound, along with theoretical calculations. At low temperatures, the compound exhibits Shubnikov-de Haas quantum oscillations, which help to map out the Fermi surface (FS) topology. This analysis reveals the existence of several small FSs in the Brillouin zone, combined with a large FS. Among them, the FS possessing the Dirac band is non-trivial and generates a non-zero Berry phase. In addition, the compound also shows anomalous Hall-like behaviour up to the CDW phase. With the CDW phase, ScV6Sn6 presents a unique material example of the versatile HfFe6Ge6 family and provides various promising opportunities to explore the series further. Comment: Published version, 19 pages, 5 figures with supplementar

    StableLLaVA: Enhanced Visual Instruction Tuning with Synthesized Image-Dialogue Data

    The remarkable multimodal capabilities demonstrated by OpenAI's GPT-4 have sparked significant interest in the development of multimodal Large Language Models (LLMs). A primary research objective of such models is to align visual and textual modalities effectively while comprehending human instructions. Current methodologies often rely on annotations derived from benchmark datasets to construct image-dialogue datasets for training purposes, akin to instruction tuning in LLMs. However, these datasets often exhibit domain bias, potentially constraining the generative capabilities of the models. In an effort to mitigate these limitations, we propose a novel data collection methodology that synchronously synthesizes images and dialogues for visual instruction tuning. This approach harnesses the power of generative models, marrying the abilities of ChatGPT and text-to-image generative models to yield a diverse and controllable dataset with varied image content. This not only provides greater flexibility compared to existing methodologies but also significantly enhances several model capabilities. Our research includes comprehensive experiments conducted on various datasets using the open-source LLaVA model as a testbed for our proposed pipeline. Our results underscore marked enhancements across more than ten commonly assessed capabilities, Comment: Project page: https://github.com/icoz69/StableLLAV

    The role of sediment-induced light attenuation on primary production during Hurricane Gustav (2008)

    © The Author(s), 2020. This article is distributed under the terms of the Creative Commons Attribution License. The definitive version was published in Zang, Z., Xue, Z. G., Xu, K., Bentley, S. J., Chen, Q., D'Sa, E. J., Zhang, L., & Ou, Y. The role of sediment-induced light attenuation on primary production during Hurricane Gustav (2008). Biogeosciences, 17(20), (2020): 5043-5055, doi:10.5194/bg-17-5043-2020.
    We introduced a sediment-induced light attenuation algorithm into a biogeochemical model of the Coupled Ocean–Atmosphere–Wave–Sediment Transport (COAWST) modeling system. A fully coupled ocean–atmospheric–sediment–biogeochemical simulation was carried out to assess the impact of sediment-induced light attenuation on primary production in the northern Gulf of Mexico during the passage of Hurricane Gustav in 2008. When compared with model results without sediment-induced light attenuation, our new model showed better agreement with satellite data on both the magnitude of nearshore chlorophyll concentration and the spatial distribution of the offshore bloom. When Hurricane Gustav approached, resuspended sediment shifted the inner shelf ecosystem from a nutrient-limited one to a light-limited one. Only 1 week after Hurricane Gustav's landfall, accumulated nutrients and a favorable optical environment induced a posthurricane algal bloom in the top 20 m of the water column, while the productivity in the lower water column was still light-limited due to slow-settling sediment. Corresponding with the elevated offshore NO₃ flux (38.71 mmol N m⁻¹ s⁻¹) and decreased chlorophyll flux (43.10 mg m⁻¹ s⁻¹), the outer shelf posthurricane bloom should have resulted from the cross-shelf nutrient supply instead of the laterally dispersed chlorophyll. Sensitivity tests indicated that sediment light attenuation efficiency affected primary production when sediment concentration was moderately high. Model uncertainties due to colored dissolved organic matter and parameterization of sediment-induced light attenuation are also discussed.
    This research has been supported by the National Science Foundation (grant nos. CCF-1856359, EnvS-1903340, OCE-1635837 and EAR-1427389), NASA (grant no. NNH17ZHA002C), the Louisiana Board of Regents (grant no. NASA/LEQSF(2018-20)-Phase3-11) and the LSU Foundation Billy and Ann Harrison Endowment for Sedimentary Geology.
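The abstract does not give the attenuation algorithm itself. As a rough illustration of the idea only, the Python sketch below assumes a Beer-Lambert decay of photosynthetically available radiation (PAR) through a vertically uniform water column, with the diffuse attenuation coefficient written as a sum of water, chlorophyll and suspended-sediment terms. The functional form, the function names and every coefficient value are placeholders chosen for this example, not the parameterization used in the paper or in COAWST.

```python
import math

def attenuation_coefficient(chl, sediment,
                            k_water=0.04, k_chl=0.025, k_sed=0.06):
    """Total diffuse attenuation coefficient in 1/m (placeholder coefficients).

    chl:      chlorophyll concentration (mg per cubic metre)
    sediment: suspended sediment concentration (g per cubic metre)
    """
    return k_water + k_chl * chl + k_sed * sediment

def par_at_depth(par_surface, depth_m, chl, sediment):
    """PAR remaining at depth_m for a vertically uniform water column."""
    k_total = attenuation_coefficient(chl, sediment)
    return par_surface * math.exp(-k_total * depth_m)

def euphotic_depth(chl, sediment):
    """Depth (m) at which PAR falls to 1% of its surface value."""
    return math.log(100.0) / attenuation_coefficient(chl, sediment)
```

With these placeholder numbers, raising the sediment concentration shrinks `euphotic_depth`, which is the qualitative mechanism by which resuspended sediment can push a water column from nutrient limitation towards light limitation.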

    Transcriptomic Analysis of Seed Germination Under Salt Stress in Two Desert Sister Species (Populus euphratica and P. pruinosa)

    As a major abiotic stress, soil salinity limits seed germination and plant growth, development and production. Seed germination is closely related not only to the seedling survival rate but also to subsequent vegetative growth. Populus euphratica and P. pruinosa are closely related species that differ in their adaptability to salinity stress. In this study, we performed an integrative transcriptome analysis of three seed germination phases from P. euphratica and P. pruinosa under salt stress. The two-dimensional data set of this study provides a comprehensive view of the dynamic biochemical processes that underpin seed germination and salt tolerance. Our analysis identified 12831 differentially expressed genes (DEGs) for seed germination processes and 8071 DEGs for salt tolerance in the two species. Furthermore, we identified the expression profiles and main pathways in each growth phase. For seed germination, a large number of DEGs, including those involved in energy production and hormonal regulation pathways, were transiently and specifically induced in the late phase. In the comparison of salt tolerance between the two species, the flavonoid and brassinosteroid pathways were significantly enriched. More specifically, in the flavonoid pathway, FLS and F3′5′H exhibited significant differential expression. In the brassinosteroid pathway, the expression levels of DWF4, BR6OX2 and ROT3 were notably higher in P. pruinosa than in P. euphratica. Our results describe transcript dynamics and highlight secondary metabolite pathways involved in the response to salt stress during the seed germination of two desert poplars.

    Spatial Uncertainty-Aware Semi-Supervised Crowd Counting

    Semi-supervised approaches to crowd counting are attracting attention because the fully supervised paradigm is expensive and laborious: it requires a large number of images of dense crowd scenes together with their annotations. This paper proposes a spatial uncertainty-aware semi-supervised approach to crowd counting via a regularized surrogate task (binary segmentation). Unlike existing semi-supervised crowd counting methods, our spatial uncertainty-aware teacher-student framework exploits the unlabeled data by focusing on information from high-confidence regions while addressing the noisy supervision from the unlabeled data in an end-to-end manner. Specifically, we estimate spatial uncertainty maps from the teacher model's surrogate task to guide the feature learning of both the main task (density regression) and the surrogate task of the student model. In addition, we introduce a simple yet effective differential transformation layer to enforce the inherent spatial consistency regularization between the main task and the surrogate task in the student model, which helps the surrogate task yield more reliable predictions and generate high-quality uncertainty maps. Our model can thus also address the task-level perturbation problems that cause spatial inconsistency between the primary and surrogate tasks in the student model. Experimental results on four challenging crowd counting datasets demonstrate that our method outperforms state-of-the-art semi-supervised methods.