52 research outputs found

    Relative predictability of individual U.S. Supreme Court justices.

    <p>(A) Each line indicates the average relative predictability of a justice over their tenure (that is, the ratio of the predictability according to the block model algorithm in the real court to the predictability in an equivalent ideal court); the length of the line indicates the tenure. Red lines correspond to Republican-nominated justices and blue lines to Democrat-nominated justices. The background color indicates the party of the president. (B) Histogram of relative predictabilities. Bar colors indicate the fraction of Republican-nominated (red) and Democrat-nominated (blue) justices within each bin. (C) Relative predictability as a function of the nomination date of the justice. Relative predictability has decreased significantly during the period considered (p = 0.026, Spearman's rank correlation), as indicated by the dashed line (which is shown only as a guide to the eye).</p>

    Relative court predictability.

    <p>(A) The height of each bar indicates the average predictability of a court and its width the time span of the court. The color of the bar indicates the makeup of the court, with dark blue corresponding to a court with many Democrat-nominated justices, and dark red to a court with many Republican-nominated justices. (B) Cumulative distribution functions of the relative court predictability for courts that operated mostly under Democratic presidencies (blue) and Republican presidencies (red). The dashed lines indicate the means of the respective distributions.</p>

    Court idealization.

    <p>Each row represents the votes of the nine justices in a case (dark, agreement with the petitioner; bright, disagreement with the petitioner). We obtain the ideal court (right) from the real court (left) by randomly reshuffling, within each case, the votes of the justices so that the number of agreements and disagreements is preserved.</p>
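
The reshuffling described in this caption can be sketched as follows. This is only an illustration under stated assumptions: votes are stored as a cases-by-justices array of 0/1 values, and the function name `idealize_court` and the sample data are hypothetical, not taken from the original work.

```python
import numpy as np

def idealize_court(votes, seed=None):
    """Return a copy of `votes` with each row (case) shuffled independently.

    Assumed layout: one row per case, one column per justice,
    1 = agreement with the petitioner, 0 = disagreement.
    """
    rng = np.random.default_rng(seed)
    votes = np.asarray(votes)
    # Permuting within a row keeps that case's count of agreements fixed
    # while breaking any link between a vote and a specific justice.
    return np.array([rng.permutation(row) for row in votes])

# Made-up example: three cases decided 5-4, 1-8 and 9-0.
real = np.array([
    [1, 1, 1, 1, 1, 0, 0, 0, 0],
    [1, 0, 0, 0, 0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1, 1],
])
ideal = idealize_court(real, seed=0)
```

Each row of `ideal` has the same number of agreements as the matching row of `real`, which is the property the caption says the idealization preserves.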

    Performance of drug interaction inference methods on an evolving database of major adverse drug interactions.

    <p>Left: Drugs.com database; right: DrugBank dataset. (<b>A–B</b>) Area under the receiver operating characteristic (AUROC) curve. For novel interactions the AUROC gives the probability that an interaction randomly chosen from those that were added to the first snapshot has a higher score than one randomly chosen from the set of interactions that were never added to the network. Similarly, for spurious interactions the AUROC gives the probability that an interaction randomly chosen from those that were removed from the first snapshot has a lower score than one randomly chosen from the set of interactions that were not removed from the network. (<b>C–F</b>) Sensitivity-specificity curves for novel (<b>C–D</b>) and spurious interactions (<b>E–F</b>). Sensitivity is defined as the ratio of true positives to all real positives (true positives plus false negatives). Specificity is defined as the ratio of true negatives to all real negatives (true negatives plus false positives).</p>
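
The probabilistic reading of the AUROC and the two ratios defined in this caption can be made concrete with a short sketch. The function names and scores below are hypothetical, and counting ties as one half is an assumption that the caption does not address.

```python
def auroc(positive_scores, negative_scores):
    """Probability that a random positive outscores a random negative.

    Ties are counted as 1/2 (an assumption; the caption does not say).
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))

def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)

# Made-up scores: interactions later added vs. never added.
added = [0.9, 0.8, 0.4]
never_added = [0.7, 0.3, 0.2, 0.1]
novel_auroc = auroc(added, never_added)  # 11 of the 12 pairs are ordered correctly

sens, spec = sensitivity_specificity(tp=8, fn=2, tn=90, fp=10)
```

For spurious interactions the same function applies with the comparison reversed, for example by negating the scores of the removed and retained interactions before calling `auroc`.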

    Inference of drug interactions as part of the process of drug discovery and development.

    <p>For each of the two drugs ((<b>A</b>) acetophenazine and (<b>B</b>) cinacalcet) we simulate an iterative process in which a plausible interaction is suggested by the stochastic block model inference approach, the interaction is tested, and information is added to the network of known drug-drug interactions. The graphs display the number of true interactions discovered as a function of the number of experiments carried out. Green dots represent true interactions, whereas red dots represent drugs that were suggested as interaction candidates but turned out not to interact with the target drug. For acetophenazine, the 16 iterations we carry out are enough to discover 11 of the 15 interactions that are reported in DrugBank. For cinacalcet, we are able to uncover 8 of the 12 reported interactions. The gray region indicates the feasible region of discovery. Its upper bound corresponds to discovering all interactions without ever testing a drug that does not interact with the target drug; the lower bound corresponds to randomly exploring all possible interactions. In the lower bound, it takes around 100 experiments to uncover each interaction.</p>

    Stability of social signatures.

    <p>(<b>A</b>) Distribution of the standardized Shannon entropy <i>S</i><sub><i>i</i></sub> (see text) for users in the period 2007–2010. Entropy quantifies the extent to which an individual's communication efforts are distributed among her contacts, so that <i>S</i><sub><i>i</i></sub> = 1 when user <i>i</i> exchanges the same number of emails with all her contacts and <i>S</i><sub><i>i</i></sub> ≈ 0 when she exchanges almost all of her emails with a single contact. Distributions for all years collapse onto a single curve. The line shows a kernel density estimation of the four yearly datasets pooled together. (<b>B</b>) Distributions of the change of individual standardized Shannon entropy Δ<i>S</i><sub><i>i</i></sub>(Δ<i>t</i>) = <i>S</i><sub><i>i</i></sub>(<i>t</i> + Δ<i>t</i>) − <i>S</i><sub><i>i</i></sub>(<i>t</i>), ∀<i>i</i>, for Δ<i>t</i> = 1, 2, 3 years (dots, squares and diamonds, respectively). The lines show the Laplace best fits based on BIC for the three distributions (Δ<i>t</i> = 1: <i>σ</i> = 0.065; Δ<i>t</i> = 2: <i>σ</i> = 0.075; and Δ<i>t</i> = 3: <i>σ</i> = 0.085). (<b>C</b>) Comparison between the absolute difference in individual social signatures |Δ<i>S</i><sub><i>i</i></sub>(Δ<i>t</i>)|<sub>self</sub> = |<i>S</i><sub><i>i</i></sub>(<i>t</i> + Δ<i>t</i>) − <i>S</i><sub><i>i</i></sub>(<i>t</i>)| and the typical absolute difference of entropies between individuals |Δ<i>S</i><sub><i>ij</i></sub>|<sub>ref</sub> = |<i>S</i><sub><i>i</i></sub>(<i>t</i>) − <i>S</i><sub><i>j</i></sub>(<i>t</i>)|. The boxplot shows unambiguously that users have stable social signatures.</p>
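
The caption defers the exact definition of the standardized entropy to the text, which is not reproduced here. The sketch below therefore assumes one common normalization, the Shannon entropy of a user's email shares divided by the logarithm of her number of contacts. That choice reproduces the two limits the caption states (1 for a uniform split, about 0 when one contact dominates) but may differ from the authors' formula. The function name is hypothetical.

```python
import math

def standardized_entropy(email_counts):
    """Shannon entropy of a user's email shares, divided by log(#contacts).

    ASSUMPTION: normalization by log(k); the source only says "see text".
    """
    counts = [c for c in email_counts if c > 0]
    k = len(counts)
    if k < 2:
        return 0.0  # convention chosen here for users with a single contact
    total = sum(counts)
    entropy = -sum((c / total) * math.log(c / total) for c in counts)
    return entropy / math.log(k)

uniform = standardized_entropy([10, 10, 10, 10])  # same volume with every contact
skewed = standardized_entropy([997, 1, 1, 1])     # almost everything to one contact
```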

    Long term email communication data within an organization

    <p>Undirected email correspondence between users of a large organization with over 1,000 individuals for four consecutive years (2007-2010). For this period, we have information on the sender, the receiver and the total number of emails sent within the organization using the corporate email address. To preserve users' privacy, individuals are completely anonymized and we do not have access to email content (see Ethics statement).</p> <p>The data is in the following format:</p> <p>user1ID user2ID #emails</p> <p>Where #emails is the total number of emails exchanged (sent and received) in one calendar year. The files are separated by year.</p> <p>Ethics statement:</p> <p>This data is exempt from IRB review because: i) The research involves the study of existing data (email logs from 2007 to 2010, which the IT service of the organization archived routinely, as mandated by law); ii) The information is recorded by the investigators in such a manner that subjects cannot be identified, directly or through identifiers linked to the subjects. Indeed, subjects were assigned a "hash" by the IT service prior to the start of our research, so that none of the investigators can link the "hash" back to the subject. We have no demographic information of any kind, so de-anonymization is also impossible.</p>
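
A minimal sketch of reading one yearly file in the `user1ID user2ID #emails` format described above. It assumes whitespace-separated fields and integer counts, which the description suggests but does not guarantee. The function names and sample lines are hypothetical, and the per-user total computed at the end is simply the number of emails each user exchanged with others that year.

```python
from collections import defaultdict

def parse_edges(lines):
    """Parse 'user1ID user2ID #emails' lines into an undirected edge dict."""
    edges = {}
    for line in lines:
        parts = line.split()
        if len(parts) != 3:
            continue  # skip blank or malformed lines
        u, v, n = parts[0], parts[1], int(parts[2])
        key = tuple(sorted((u, v)))  # undirected: (a, b) and (b, a) are one edge
        edges[key] = edges.get(key, 0) + n
    return edges

def emails_per_user(edges):
    """Total emails each user exchanged with all of her contacts."""
    totals = defaultdict(int)
    for (u, v), n in edges.items():
        totals[u] += n
        totals[v] += n
    return dict(totals)

# Made-up sample standing in for one yearly file.
sample = ["a1 b2 12", "a1 c3 3", "b2 c3 5"]
edges = parse_edges(sample)
totals = emails_per_user(edges)
```

With a real file, the list `sample` would be replaced by the lines of that file, for example by iterating over an open file object.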

    Predictability of logarithmic growth rates for connection weight <i>r</i><sub><i>ω</i></sub>(<i>t</i> + 1) (A, C, E) and user strength <i>r</i><sub><i>s</i></sub>(<i>t</i> + 1) (B, D, F).

    <p>(<b>A</b>) Joint probability density of <i>r</i><sub><i>ω</i></sub>(<i>t</i> + 1), the logarithmic growth rate of weights at time <i>t</i> + 1, and <i>r</i><sub><i>ω</i></sub>(<i>t</i>), the logarithmic growth rate of weights at time <i>t</i>. (<b>B</b>) Joint probability density of <i>r</i><sub><i>s</i></sub>(<i>t</i> + 1), the logarithmic growth rate of strengths at time <i>t</i> + 1, and <i>r</i><sub><i>s</i></sub>(<i>t</i>), the logarithmic growth rate of strengths at time <i>t</i>. (<b>C</b>) Joint probability density of <i>r</i><sub><i>ω</i></sub>(<i>t</i> + 1), the logarithmic growth rate of weights at time <i>t</i> + 1, and <i>ω</i>(<i>t</i>), the weight at time <i>t</i>. The area shaded in grey is not allowed since <i>r</i><sub><i>ω</i></sub>(<i>t</i> + 1) ≥ −log <i>ω</i>(<i>t</i>). (<b>D</b>) Joint probability density of <i>r</i><sub><i>s</i></sub>(<i>t</i> + 1), the logarithmic growth rate of strengths at time <i>t</i> + 1, and <i>s</i>(<i>t</i>), the strength at time <i>t</i>. The area shaded in grey is not allowed since <i>r</i><sub><i>s</i></sub>(<i>t</i> + 1) ≥ −log <i>s</i>(<i>t</i>). In plots (<b>A</b>–<b>D</b>), circles and error bars show the mean and one standard error of the mean for values binned along the X axis. It is visually apparent that <i>ω</i>(<i>t</i>) and <i>s</i>(<i>t</i>) are more informative about <i>r</i><sub><i>ω</i></sub>(<i>t</i> + 1) and <i>r</i><sub><i>s</i></sub>(<i>t</i> + 1), respectively, than <i>r</i><sub><i>ω</i></sub>(<i>t</i>) and <i>r</i><sub><i>s</i></sub>(<i>t</i>) (as confirmed by Spearman's <i>ρ</i> and p-values, displayed inside each graph). (<b>E</b>, <b>F</b>) Root mean squared error (RMSE) of the predictions of the logarithmic growth rates at time <i>t</i> + 1 obtained from leave-one-out experiments. As predictors, we use: (<b>E</b>) <i>ω</i>(<i>t</i>), <i>r</i><sub><i>ω</i></sub>(<i>t</i>), and <i>μ</i><sub><i>ω</i></sub>(<i>t</i>) (see <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0146113#pone.0146113.e014" target="_blank">Eq (5)</a>); (<b>F</b>) <i>s</i>(<i>t</i>), <i>r</i><sub><i>s</i></sub>(<i>t</i>), and <i>μ</i><sub><i>s</i></sub>(<i>t</i>) (see <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0146113#pone.0146113.e005" target="_blank">Eq (3)</a>). Additionally, in both cases we try to predict the logarithmic growth rate using a Random Forest regressor [<a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0146113#pone.0146113.ref029" target="_blank">29</a>]. Note that a simple approach (i.e. considering the weight/strength at time <i>t</i>) performs significantly better than a generally well-performing machine learning algorithm such as the Random Forest. In any case, and despite being the most predictive, the weight/strength at time <i>t</i> provides only moderate improvements over predictions made using the mean value <i>μ</i><sub><i>ω</i></sub> for all connections and <i>μ</i><sub><i>s</i></sub> for all users.</p>

    Stability of individual communication strategies.

    <p>(<b>A</b>) Distribution of the fraction of emails sent by users to preexisting contacts <i>f</i><sub><i>i</i></sub> (see text). The line shows the kernel density estimation of the three yearly datasets pooled together. Most users exchange most of their emails with preexisting contacts, with the maximum at <math><mrow><msubsup><mi>f</mi><mi>e</mi><mrow><mi>m</mi><mi>a</mi><mi>x</mi></mrow></msubsup><mo>=</mo><mn>0</mn><mo>.</mo><mn>90</mn></mrow></math>. (<b>B</b>) Distribution of the change of <i>f</i><sub><i>i</i></sub>, Δ<i>f</i><sub><i>i</i></sub>(Δ<i>t</i>) = <i>f</i><sub><i>i</i></sub>(<i>t</i> + Δ<i>t</i>) − <i>f</i><sub><i>i</i></sub>(<i>t</i>), for Δ<i>t</i> = 1, 2 years (dots and squares, respectively). The lines show the Laplace best fits based on BIC for the two distributions (<i>P</i>(Δ<i>f</i><sub><i>i</i></sub>) ∼ exp(−|Δ<i>f</i><sub><i>i</i></sub> − <i>μ</i>|/<i>σ</i>); Δ<i>t</i> = 1: <i>σ</i> = 0.18, <i>μ</i> = 0.046; and Δ<i>t</i> = 2: <i>σ</i> = 0.19, <i>μ</i> = 0.062). Most users keep the fraction of emails sent to preexisting contacts constant in time, and the distributions are quite stable in time despite a slight shift towards larger changes for larger Δ<i>t</i>. (<b>C</b>) Comparison between the yearly absolute individual change in the fraction of emails sent to preexisting contacts |Δ<i>f</i><sub><i>i</i></sub>(Δ<i>t</i>)|<sub>self</sub> and the typical differences between users |Δ<i>f</i><sub><i>ij</i></sub>|<sub>ref</sub> = |<i>f</i><sub><i>i</i></sub>(<i>t</i>) − <i>f</i><sub><i>j</i></sub>(<i>t</i>)|, ∀<i>j</i> ≠ <i>i</i>. The boxplot shows unambiguously that individual users have a stable communication strategy over time.</p>
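
To illustrate the Laplace form quoted in panel B, the sketch below estimates its two parameters by maximum likelihood: for a Laplace density the location estimate is the sample median and the scale estimate is the mean absolute deviation from it. This is a stand-in, not the authors' procedure; the caption says only that their fits were selected using BIC. The function name and data are hypothetical.

```python
import statistics

def fit_laplace(samples):
    """Maximum-likelihood fit of P(x) ~ exp(-|x - mu| / sigma).

    mu is the sample median; sigma is the mean absolute deviation from mu.
    """
    mu = statistics.median(samples)
    sigma = sum(abs(x - mu) for x in samples) / len(samples)
    return mu, sigma

# Made-up yearly changes in the fraction of emails sent to preexisting contacts.
changes = [-0.2, -0.05, 0.0, 0.05, 0.1, 0.3, 0.4]
mu, sigma = fit_laplace(changes)
```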

    Time evolution of nodes' strengths.

    <p>The strength <i>s</i><sub><i>i</i></sub> of node <i>i</i> is the number of emails that user <i>i</i> exchanged with other users during one year. (<b>A</b>) Distributions of strengths for each one of the years in our dataset (2007-2010). Note that the distribution is stable in time. (<b>B</b>) Distribution of centered strength logarithmic growth rates <math><mrow><msubsup><mi>r</mi><mi>s</mi><mn>0</mn></msubsup><mo>=</mo><mo>log</mo><mrow><mo>(</mo><mi>s</mi><mrow><mo>(</mo><mi>t</mi><mo>+</mo><mo>Δ</mo><mi>t</mi><mo>)</mo></mrow><mo>)</mo></mrow><mo>-</mo><mo>log</mo><mrow><mo>(</mo><mi>s</mi><mrow><mo>(</mo><mi>t</mi><mo>)</mo></mrow><mo>)</mo></mrow><mo>-</mo><mi>μ</mi><mrow><mo>(</mo><mi>t</mi><mo>,</mo><mo>Δ</mo><mi>t</mi><mo>)</mo></mrow></mrow></math> for Δ<i>t</i> = 1, 2, 3 years (dots, squares and diamonds, respectively). Lines show fits to a Laplace distribution (parameters Δ<i>t</i> = 1: <i>σ</i><sub>exp</sub> = 0.57, Δ<i>t</i> = 2: <i>σ</i><sub>exp</sub> = 0.74 and Δ<i>t</i> = 3: <i>σ</i><sub>exp</sub> = 0.83). Note that as Δ<i>t</i> increases the distributions are wider (see Fig D in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0146113#pone.0146113.s002" target="_blank">S2 File</a>). For the specific values of the distribution modes <i>μ</i>(<i>t</i>, Δ<i>t</i>) see Fig B in <a href="http://www.plosone.org/article/info:doi/10.1371/journal.pone.0146113#pone.0146113.s002" target="_blank">S2 File</a>.</p>
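
The centered growth rate in panel B, log s(t + Δt) − log s(t) − μ(t, Δt), can be sketched as below. The caption describes μ(t, Δt) as the mode of the distribution. This sketch substitutes the sample median, which coincides with the mode for a Laplace distribution. That substitution is an assumption made here, not the authors' estimator. The function name and the strengths are hypothetical.

```python
import math
import statistics

def centered_log_growth(s_t, s_later):
    """Centered logarithmic growth rates of strength for users present in both years.

    ASSUMPTION: mu(t, dt) is approximated by the median of the raw rates.
    """
    raw = {
        user: math.log(s_later[user]) - math.log(s_t[user])
        for user in s_t
        if user in s_later and s_t[user] > 0 and s_later[user] > 0
    }
    mu = statistics.median(raw.values())
    centered = {user: rate - mu for user, rate in raw.items()}
    return centered, mu

# Made-up strengths (emails exchanged per year) for three users.
s_2007 = {"a": 10, "b": 20, "c": 40}
s_2008 = {"a": 40, "b": 40, "c": 40}
centered, mu = centered_log_growth(s_2007, s_2008)
```

In this toy example the raw rates are log 4, log 2 and 0, so the median is log 2 and the centered rates are symmetric around zero.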