Statistical inference framework for source detection of contagion processes on arbitrary network structures
In this paper we introduce a statistical inference framework for estimating
the contagion source from a partially observed contagion spreading process on
an arbitrary network structure. The framework is based on a maximum likelihood
estimation of a partial epidemic realization and involves large scale
simulation of contagion spreading processes from the set of potential source
locations. We present several likelihood estimators that are used
to determine the conditional probabilities associated with observing a partial
epidemic realization for particular source location candidates. This
statistical inference framework is also applicable to arbitrary compartmental
contagion spreading processes on networks. We compare the estimation accuracy of
these approaches in a number of computational experiments performed with the
SIR (susceptible-infected-recovered), SI (susceptible-infected) and ISS
(ignorant-spreading-stifler) contagion spreading models on synthetic and
real-world complex networks.
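The simulation-based idea described in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the discrete-time SI model, the function names (`simulate_si`, `estimate_source`), the parameters `p`, `steps` and `n_sims`, and the use of mean Jaccard similarity as a stand-in for a likelihood estimator are all assumptions made for illustration.

```python
import random


def simulate_si(adj, source, p, steps, rng):
    """Run a discrete-time SI process and return the set of infected nodes."""
    infected = {source}
    for _ in range(steps):
        newly = set()
        for u in infected:
            for v in adj[u]:
                # Each infected node infects each susceptible neighbour with probability p.
                if v not in infected and rng.random() < p:
                    newly.add(v)
        infected |= newly
    return infected


def jaccard(a, b):
    """Jaccard similarity between two node sets."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def estimate_source(adj, observed, p, steps, n_sims=200, seed=0):
    """Score every candidate source by how closely its simulated
    realizations match the observed one (assumed proxy for a likelihood)."""
    rng = random.Random(seed)
    scores = {}
    for candidate in sorted(observed):  # the source must itself be infected
        sims = [simulate_si(adj, candidate, p, steps, rng) for _ in range(n_sims)]
        scores[candidate] = sum(jaccard(s, observed) for s in sims) / n_sims
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical toy network: a path graph 0-1-2-3-4.
adj = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2, 4], 4: [3]}
best, scores = estimate_source(adj, observed={1, 2, 3}, p=1.0, steps=1)
```

With `p=1.0` and a single step the process is deterministic, so the toy example ranks node 2 first; for stochastic settings the ranking depends on `n_sims` and the chosen similarity score.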
Metals in Proteins: Correlation Between the Metal-Ion Type, Coordination Number and the Amino-Acid Residues Involved in the Coordination
Metal ions are constituents of many metalloproteins, in which they have either catalytic (metalloenzymes) or structural functions. In this work, the characteristics of various metals (Cu, Zn, Mg, Mn, Fe, Co, Ni, Cd and Ca) in proteins with known crystal structure were studied, as well as the specificity of their environments. The analysis was performed on two data sets: the set of protein structures in the Protein Data Bank (PDB) determined with resolution < 1.5 Å and the set of nonredundant protein structures from the PDB. The former was used to determine the distances between each metal ion and its electron donors, and the latter was used to assess the preferred coordination numbers and common combinations of amino-acid residues in the neighbourhood of each metal. Although the metal ions considered predominantly had a valence of two, their preferred coordination number and the type of amino-acid residues that participate in the coordination differed significantly from one metal ion to the next. This study concentrates on finding the specificities of a metal-ion environment, namely the distribution of coordination numbers and the amino-acid residue types that frequently take part in coordination. Furthermore, the correlation between the coordination number and the occurrence of certain amino-acid residues (quartets and triplets) in a metal-ion coordination sphere was analysed. The results obtained are of particular value for the identification and modelling of metal-binding sites in protein structures derived by homology modelling. Knowledge of the geometry and characteristics of the metal-binding sites in metalloproteins of known function can help to determine more closely the biological activity of proteins of unknown function and to aid in the design of proteins with specific affinity for certain metals.
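As a rough illustration of the quantities discussed in this abstract, the sketch below counts the donor atoms within a distance cutoff of a metal ion and records which residues they belong to. The function name `coordination_shell`, the 2.8 Å cutoff and the toy coordinates are hypothetical choices for this example, not values taken from the study.

```python
import math


def coordination_shell(metal_xyz, donors, cutoff=2.8):
    """Return (residue, distance) pairs for donor atoms within `cutoff` Å
    of the metal ion, sorted by distance. The cutoff is an assumed value."""
    shell = []
    for residue, xyz in donors:
        d = math.dist(metal_xyz, xyz)
        if d <= cutoff:
            shell.append((residue, round(d, 2)))
    return sorted(shell, key=lambda item: item[1])


# Hypothetical toy coordinates in Å (not from any PDB entry).
zn = (0.0, 0.0, 0.0)
donors = [
    ("HIS", (2.1, 0.0, 0.0)),
    ("HIS", (0.0, 2.0, 0.0)),
    ("CYS", (0.0, 0.0, 2.3)),
    ("CYS", (-1.3, -1.3, -1.3)),
    ("ASP", (4.0, 1.0, 0.0)),  # too far away to count as coordinating
]

shell = coordination_shell(zn, donors)
coordination_number = len(shell)
residue_combination = tuple(sorted(residue for residue, _ in shell))
```

Tallying `coordination_number` and `residue_combination` over many sites would give the kind of distributions the abstract refers to, under whatever cutoff and donor-atom definition one adopts.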
Disentangling Sources of Influence in Online Social Networks
Information propagation in online social networks is facilitated by two types of influence: endogenous (peer) influence, which acts between users of the social network, and exogenous (external) influence, which corresponds to various external mediators such as online news media. However, inference of these influences from data remains a challenge, especially when data on the activation of users is scarce. In this paper we propose a methodology that yields estimates of both endogenous and exogenous influence using only a social network structure and a single activation cascade. Our method exploits the statistical differences between the two types of influence: endogenous influence depends on the social network structure and the current state of each user, while exogenous influence is independent of these. We evaluate our methodology on simulated activation cascades as well as on cascades obtained from several large Facebook political survey applications. We show that our methodology is able to provide estimates of endogenous and exogenous influence in online social networks, characterize the activation of each individual user as being endogenously or exogenously driven, and identify the most influential groups of users.
Direct identification of A-to-I editing sites with nanopore native RNA sequencing
Inosine is a prevalent RNA modification in animals and is formed when an adenosine is deaminated by the ADAR family of enzymes. Traditionally, inosines are identified indirectly as variants from Illumina RNA-sequencing data because they are interpreted as guanosines by the cellular machinery. However, this indirect method performs poorly in protein-coding regions where exons are typically short, in non-model organisms with sparsely annotated single-nucleotide polymorphisms, or in disease contexts where unknown DNA mutations are pervasive. Here, we show that Oxford Nanopore direct RNA sequencing can be used to identify inosine-containing sites in native transcriptomes with high accuracy. We trained convolutional neural network models to distinguish inosine from adenosine and guanosine, and to estimate the modification rate at each editing site. Furthermore, we demonstrated their utility on the transcriptomes of human, mouse and Xenopus. Our approach expands the toolkit for studying adenosine-to-inosine editing and can be further extended to investigate other RNA modifications.