
    Sequence-based Multiscale Model (SeqMM) for High-throughput chromosome conformation capture (Hi-C) data analysis

    In this paper, I introduce a Sequence-based Multiscale Model (SeqMM) for biomolecular data analysis. By combining it with the spectral graph method, I reveal the essential difference between global-scale and local-scale models in structure clustering: they optimize Euclidean (or spatial) distances and sequential (or genomic) distances, respectively. More specifically, clusters from global-scale models optimize Euclidean distance relations, whereas local-scale models yield clusters that optimize genomic distance relations. For biomolecular data, Euclidean distances and sequential distances are two independent variables that can never be optimized simultaneously in data clustering. However, the sequence scale in my SeqMM can work as a tuning parameter that balances these two variables and delivers different clusterings depending on the purpose. Further, my SeqMM is used to explore the hierarchical structures of chromosomes. I find that at the global scale, the Fiedler vector from my SeqMM bears a great similarity to the principal vector from principal component analysis and can be used to study genomic compartments. In TAD analysis, I find that TADs evaluated at different scales are not consistent and vary considerably. In particular, when the sequence scale is small, the calculated TAD boundaries are dramatically different. Even for regions with high contact frequencies, TAD regions show no obvious consistency. However, when the scale value increases further, although TADs are still quite different, TAD boundaries in these high-contact-frequency regions become more and more consistent. Finally, I find that for a fixed local scale, my method can deliver very robust TAD boundaries across different cluster numbers.
    Comment: 22 pages, 13 figures
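
    The Fiedler-vector clustering mentioned in this abstract can be illustrated with a minimal sketch. It is not the SeqMM method itself: it applies plain spectral bisection to a small, made-up symmetric contact matrix, and the function name `fiedler_partition` and the matrix `toy` are hypothetical.

```python
import numpy as np

def fiedler_partition(contacts):
    """Split nodes into two groups by the sign of the Fiedler vector."""
    w = np.asarray(contacts, dtype=float)
    # Unnormalized graph Laplacian L = D - W, with D the diagonal degree matrix.
    laplacian = np.diag(w.sum(axis=1)) - w
    # eigh returns eigenvalues in ascending order for a symmetric matrix.
    eigenvalues, eigenvectors = np.linalg.eigh(laplacian)
    # The Fiedler vector belongs to the second-smallest eigenvalue.
    fiedler = eigenvectors[:, 1]
    return fiedler, fiedler >= 0

# Toy contact map (assumed data): two dense 3-bin blocks joined by one weak contact.
toy = np.array([
    [0.0, 1.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 0.0, 0.0, 0.0],
    [1.0, 1.0, 0.0, 0.1, 0.0, 0.0],
    [0.0, 0.0, 0.1, 0.0, 1.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0, 1.0, 0.0],
])

fiedler, labels = fiedler_partition(toy)
print(labels)
```

    The sign pattern separates the two blocks; which block receives `True` is arbitrary, because an eigenvector is only defined up to sign.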

    On the properties of fractal cloud complexes

    We study the physical properties derived from interstellar cloud complexes having a fractal structure. We first generate fractal clouds with a given fractal dimension and associate each clump with a maximum in the resulting density field. Then, we discuss the effect that different criteria for clump selection have on the derived global properties. We calculate the masses, sizes and average densities of the clumps as a function of the fractal dimension (D_f) and the fraction of the total mass in the form of clumps (epsilon). In general, clump mass does not follow a simple power law with size of the type M_cl ~ (R_cl)**(gamma); instead, the exponent changes from gamma ~ 3 at small sizes to gamma < 3 at larger sizes. The number of clumps per logarithmic mass interval can be fitted to a power law N_cl ~ (M_cl)**(-alpha_M) in the range of relatively large masses, and the corresponding size distribution is N_cl ~ (R_cl)**(-alpha_R) at large sizes. When all the mass is in clumps (epsilon = 1), we find that as D_f increases from 2 to 3, alpha_M increases from ~0.3 to ~0.6 and alpha_R increases from ~1.0 to ~2.1. Comparison with observations suggests that D_f ~ 2.6 is roughly consistent with the average properties of the ISM. On the other hand, as the fraction of mass in clumps decreases (epsilon < 1), alpha_M increases and alpha_R decreases. When only ~10% of the complex mass is in the form of dense clumps, we obtain alpha_M ~ 1.2 for D_f = 2.6 (not very different from the Salpeter value 1.35), suggesting a likely link between the stellar initial mass function and the internal structure of molecular cloud complexes.
    Comment: 32 pages, 13 figures, 1 table. Accepted for publication in Ap
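
    As a rough illustration of how an exponent such as alpha_M in N_cl ~ (M_cl)**(-alpha_M) might be estimated, the sketch below fits a straight line in log-log space by ordinary least squares. The data are synthetic and noise-free, generated only for this example; they are not the paper's results, and the name `fit_power_law_exponent` is hypothetical.

```python
import math

def fit_power_law_exponent(masses, counts):
    """Return alpha such that counts ~ masses**(-alpha), via a log-log least-squares fit."""
    xs = [math.log10(m) for m in masses]
    ys = [math.log10(n) for n in counts]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope of the best-fit line y = slope * x + intercept.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return -sxy / sxx

# Synthetic clump counts per logarithmic mass bin (assumed data, exact power law).
masses = [10.0 ** k for k in range(1, 6)]
counts = [1.0e6 * m ** (-1.2) for m in masses]

alpha_m = fit_power_law_exponent(masses, counts)
print(round(alpha_m, 6))
```

    With real clump catalogues the fit would be restricted to the mass range where the power law holds, which the abstract describes as the range of relatively large masses.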

    A convolutional autoencoder approach for mining features in cellular electron cryo-tomograms and weakly supervised coarse segmentation

    Cellular electron cryo-tomography enables the 3D visualization of cellular organization in the near-native state and at submolecular resolution. However, the contents of cellular tomograms are often complex, making it difficult to automatically isolate different in situ cellular components. In this paper, we propose a convolutional autoencoder-based unsupervised approach to provide a coarse grouping of 3D small subvolumes extracted from tomograms. We demonstrate that the autoencoder can be used for efficient and coarse characterization of features of macromolecular complexes and surfaces, such as membranes. In addition, the autoencoder can be used to detect non-cellular features related to sample preparation and data collection, such as carbon edges from the grid and tomogram boundaries. The autoencoder is also able to detect patterns that may indicate spatial interactions between cellular components. Furthermore, we demonstrate that our autoencoder can be used for weakly supervised semantic segmentation of cellular components, requiring a very small amount of manual annotation.
    Comment: Accepted by Journal of Structural Biology

    Pattern Formation on Trees

    Networks having the geometry and the connectivity of trees are considered as the spatial support of spatiotemporal dynamical processes. A tree is characterized by two parameters: its ramification and its depth. The local dynamics at the nodes of a tree is described by a nonlinear map, giving rise to a coupled map lattice system. The coupling is expressed by a matrix whose eigenvectors constitute a basis on which spatial patterns on trees can be expressed by linear combination. The spectrum of eigenvalues of the coupling matrix exhibits a nonuniform distribution, which manifests itself in the bifurcation structure of the spatially synchronized modes. These models may describe reaction-diffusion processes and several other phenomena occurring on heterogeneous media with hierarchical structure.
    Comment: Submitted to Phys. Rev. E, 15 pages, 9 fig
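
    A coupled map lattice on a tree can be sketched as follows. The abstract does not specify the local map or the coupling form, so this example assumes a logistic map at each node and diffusive coupling to the average over nearest neighbors; the names `build_tree`, `logistic`, and `step`, and all parameter values, are hypothetical.

```python
def build_tree(ramification, depth):
    """Adjacency list of a tree in which every non-leaf node has `ramification` children."""
    neighbors = [[]]          # node 0 is the root
    frontier = [0]
    for _level in range(depth):
        next_frontier = []
        for parent in frontier:
            for _branch in range(ramification):
                child = len(neighbors)
                neighbors.append([parent])
                neighbors[parent].append(child)
                next_frontier.append(child)
        frontier = next_frontier
    return neighbors

def logistic(x, r=3.9):
    """Local nonlinear map (assumed choice)."""
    return r * x * (1.0 - x)

def step(state, neighbors, coupling=0.2):
    """One update: apply the local map, then mix each node with its neighbors' average."""
    mapped = [logistic(x) for x in state]
    new_state = []
    for i, nbrs in enumerate(neighbors):
        if nbrs:
            neighbor_mean = sum(mapped[j] for j in nbrs) / len(nbrs)
        else:
            neighbor_mean = mapped[i]  # isolated node (depth 0): no coupling
        new_state.append((1.0 - coupling) * mapped[i] + coupling * neighbor_mean)
    return new_state

tree = build_tree(ramification=2, depth=3)
state = [0.3] * len(tree)     # spatially synchronized initial condition
for _ in range(5):
    state = step(state, tree)
print(len(tree), max(state) - min(state))
```

    Under this coupling, a spatially uniform state stays uniform up to rounding error, which is the kind of synchronized mode whose bifurcation structure the abstract refers to.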

    A Unified Approach for Representing Structurally-Complex Models in SBML Level 3

    The aim of this document is to explore a unified approach to handling several of the proposed extensions to the SBML Level 3 Core specification. The approach is illustrated with reference to Simile, a modelling environment which appears to have most of the capabilities of the various SBML Level 3 package proposals which deal with model structure. Simile (http://www.simulistics.com) is a visual modelling environment for continuous systems modelling which includes the ability to handle complex disaggregation of model structure, by allowing the modeller to specify classes of object and the relationships between them.
    The note is organised around the 6 packages listed on the SBML Level 3 Proposals web page (http://sbml.org/Community/Wiki/SBML_Level_3_Proposals) which deal with model structure, namely comp, arrays, spatial, geom, dyn and multi. For each one, I consider how the requirements which motivated the package can be handled using Simile's unified approach. Although Simile has a declarative model-representation language (in both Prolog and XML syntax), I use Simile diagrams and equation syntax throughout, since this is more compact and readable than large chunks of XML.
    The conclusion is that Simile can indeed meet most of the requirements of these various packages, using a generic set of constructs - basically, the multiple-instance submodel, the concept of a relationship (association) between submodels, and array variables. This suggests the possibility of having a single SBML Level 3 extension package similar to the Simile data model, rather than a series of separate packages. Such an approach has a number of potential advantages and disadvantages compared with having the current set of discrete packages: these are discussed in this paper.

    Combinatorial Gradient Fields for 2D Images with Empirically Convergent Separatrices

    This paper proposes an efficient probabilistic method that computes combinatorial gradient fields for two-dimensional image data. In contrast to existing algorithms, this approach yields a geometric Morse-Smale complex that converges almost surely to its continuous counterpart when the image resolution is increased. This approach is motivated using basic ideas from probability theory and builds upon an algorithm from discrete Morse theory with a strong mathematical foundation. While a formal proof is only hinted at, we do provide a thorough numerical evaluation of our method and compare it to established algorithms.
    Comment: 17 pages, 7 figures