Subharmonic Almost Periodic Functions of Slow Growth
We obtain a complete description of the Riesz measures of almost periodic subharmonic functions of at most linear growth on C. As a consequence, we get a complete description of zero sets for the class of entire functions of exponential type with almost periodic modula
Subharmonic almost periodic functions
We prove that almost periodicity in the sense of distributions coincides with almost periodicity with respect to Stepanov's metric for the class of subharmonic functions in a strip {z ∈ C : a < Im z < b}. We also prove that the Fourier coefficients of these functions are continuous functions of Im z. Further, if the logarithm of a subharmonic almost periodic function is a subharmonic function, then it is almost periodic
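For reference, the Stepanov notion mentioned above is usually stated for functions on the real line as follows. This is the standard textbook definition, given only as a sketch; the version used in the paper for functions in a strip (with its dependence on Im z) may be formulated differently.

```latex
% Stepanov distance between locally integrable f, g on \mathbb{R} (unit window length):
d_{S}(f,g) \;=\; \sup_{x\in\mathbb{R}} \int_{x}^{x+1} \lvert f(t)-g(t)\rvert \, dt .
% f is Stepanov almost periodic if, for every \varepsilon > 0, the set
% \{\tau\in\mathbb{R} : d_{S}(f(\cdot+\tau),\, f) < \varepsilon\}
% is relatively dense in \mathbb{R}.
```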
Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network
Using the Cap Analysis of Gene Expression (CAGE) technology, the FANTOM5 consortium provided one of the most comprehensive maps of transcription start sites (TSSs) in several species. Strikingly, ~72% of them could not be assigned to a specific gene and initiate at unconventional regions, outside promoters or enhancers. Here, we probe these unassigned TSSs and show that, in all species studied, a significant fraction of CAGE peaks initiate at microsatellites, also called short tandem repeats (STRs). To confirm this transcription, we develop Cap Trap RNA-seq, a technology which combines cap trapping and long read MinION sequencing. We train sequence-based deep learning models able to predict CAGE signal at STRs with high accuracy. These models unveil the importance of STR surrounding sequences not only to distinguish STR classes, but also to predict the level of transcription initiation. Importantly, genetic variants linked to human diseases are preferentially found at STRs with high transcription initiation level, supporting the biological and clinical relevance of transcription initiation at STRs. Together, our results extend the repertoire of non-coding transcription associated with DNA tandem repeats and complexify STR polymorphism
A Gibbs sampler for identification of symmetrically structured, spaced DNA motifs with improved estimation of the signal length
Motivation: Transcription regulatory protein factors often bind DNA as homo-dimers or hetero-dimers. Thus they recognize structured DNA motifs that are inverted or direct repeats or spaced motif pairs. However, these motifs are often difficult to identify owing to their high divergence. Including the motif structure explicitly in the motif recognition algorithm improves recognition efficiency for highly divergent motifs as well as estimation of motif geometric parameters.
Result: We present a modification of the Gibbs sampling motif extraction algorithm, SeSiMCMC (Sequence Similarities by Markov Chain Monte Carlo), which finds structured motifs of these types, as well as non-structured motifs, in a set of unaligned DNA sequences. It employs improved estimators of motif and spacer lengths. The probability that a sequence does not contain any motif is accounted for in a rigorous Bayesian manner. We have applied the algorithm to a set of upstream regions of genes from two Escherichia coli regulons involved in respiration. We have demonstrated that accounting for a symmetric motif structure allows the algorithm to identify weak motifs more accurately. In the examples studied, ArcA binding sites were demonstrated to have the structure of a direct spaced repeat, whereas NarP binding sites exhibited a palindromic structure.
Availability: The WWW interface of the program, its FreeBSD (4.0) and Windows 32 console executables are available at http://bioinform.genetika.ru/SeSiMCM
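As a rough illustration of the Gibbs sampling idea that the abstract above builds on, here is a minimal sketch of a plain site sampler in Python. It is not SeSiMCMC: it assumes a fixed motif width, exactly one site per sequence, and a uniform background, and it does not model symmetric or spaced structure, motif absence, or length estimation. All function names are hypothetical and chosen for this example only.

```python
import random

ALPHABET = "ACGT"


def profile_from_sites(sites, pseudocount=1.0):
    """Column-wise base probabilities estimated from aligned sites, with pseudocounts."""
    width = len(sites[0])
    profile = []
    for col in range(width):
        counts = {base: pseudocount for base in ALPHABET}
        for site in sites:
            counts[site[col]] += 1
        total = sum(counts.values())
        profile.append({base: counts[base] / total for base in ALPHABET})
    return profile


def site_probability(site, profile):
    """Probability of a candidate site under the position-specific profile."""
    p = 1.0
    for col, base in enumerate(site):
        p *= profile[col][base]
    return p


def gibbs_motif_search(seqs, width, iterations=500, seed=0):
    """Return one sampled motif start per sequence (hypothetical helper).

    Assumes at least two sequences, each at least `width` long and over ACGT.
    """
    rng = random.Random(seed)
    starts = [rng.randrange(len(s) - width + 1) for s in seqs]
    for _ in range(iterations):
        # Hold one sequence out and build the profile from all the others.
        i = rng.randrange(len(seqs))
        others = [seqs[j][starts[j]:starts[j] + width]
                  for j in range(len(seqs)) if j != i]
        profile = profile_from_sites(others)
        # Resample the held-out start in proportion to its profile probability.
        n_positions = len(seqs[i]) - width + 1
        weights = [site_probability(seqs[i][k:k + width], profile)
                   for k in range(n_positions)]
        starts[i] = rng.choices(range(n_positions), weights=weights)[0]
    return starts


if __name__ == "__main__":
    toy = ["ACGTACGTTTGACAAC", "GGTTGACACCGTAGCA", "CATTGACATTTTGCGA"]
    print(gibbs_motif_search(toy, width=6))
```

Because the sampler is stochastic, the returned starts depend on the seed and the number of iterations; in practice one would run several chains and compare the resulting alignments.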
A promoter-level mammalian expression atlas
Regulated transcription controls the diversity, developmental pathways and spatial organization of the hundreds of cell types that make up a mammal. Using single-molecule cDNA sequencing, we mapped transcription start sites (TSSs) and their usage in human and mouse primary cells, cell lines and tissues to produce a comprehensive overview of mammalian gene expression across the human body. We find that few genes are truly ‘housekeeping’, whereas many mammalian promoters are composite entities composed of several closely separated TSSs, with independent cell-type-specific expression profiles. TSSs specific to different cell types evolve at different rates, whereas promoters of broadly expressed genes are the most conserved. Promoter-based expression analysis reveals key transcription factors defining cell states and links them to binding-site motifs. The functions of identified novel transcripts can be predicted by coexpression and sample ontology enrichment analyses. The functional annotation of the mammalian genome 5 (FANTOM5) project provides comprehensive expression profiles and functional annotation of mammalian cell-type-specific transcriptomes with wide applications in biomedical research
Discovery of widespread transcription initiation at microsatellites predictable by sequence-based deep neural network
10.1038/s41467-021-23143-7 Nature Communications 121329