
29 research outputs found

Methods for estimating human endogenous retrovirus activities from EST databases-3

Author: Jaakko Peltonen (74250)
Jonas Blomberg (74251)
Merja Oja (74249)
Samuel Kaski (74252)

Copyright information: Taken from "Methods for estimating human endogenous retrovirus activities from EST databases", http://www.biomedcentral.com/1471-2105/8/S2/S11, BMC Bioinformatics 2007;8(Suppl 2):S11-S11. Published online 3 May 2007. PMCID: PMC1892069. [...] are plotted separately on the left (random jitter has been added in the age direction). There is no clear correlation between estimated age and activity. There is a more detailed figure in the Additional file

The Francis Crick Institute

Methods for estimating human endogenous retrovirus activities from EST databases-2

Author: Jaakko Peltonen (74250)
Jonas Blomberg (74251)
Merja Oja (74249)
Samuel Kaski (74252)

Copyright information: Taken from "Methods for estimating human endogenous retrovirus activities from EST databases", http://www.biomedcentral.com/1471-2105/8/S2/S11, BMC Bioinformatics 2007;8(Suppl 2):S11-S11. Published online 3 May 2007. PMCID: PMC1892069. [...] the curve presents EST hit intensity along the HERV structure. See Table 1 for more information on this HERV. EST hit areas for other highly active HERVs are shown in Supplementary Fig. 4 in Additional file

The Francis Crick Institute

Methods for estimating human endogenous retrovirus activities from EST databases-1

Author: Jaakko Peltonen (74250)
Jonas Blomberg (74251)
Merja Oja (74249)
Samuel Kaski (74252)

The Francis Crick Institute

Methods for estimating human endogenous retrovirus activities from EST databases-4

Author: Jaakko Peltonen (74250)
Jonas Blomberg (74251)
Merja Oja (74249)
Samuel Kaski (74252)

Copyright information: Taken from "Methods for estimating human endogenous retrovirus activities from EST databases", http://www.biomedcentral.com/1471-2105/8/S2/S11, BMC Bioinformatics 2007;8(Suppl 2):S11-S11. Published online 3 May 2007. PMCID: PMC1892069. [...]t. The two darkest gray areas together show the proportion of active HERVs in that group; the lightest gray area shows the proportion of inactive HERVs. The widths of the bars are proportional to the size of the HERV group. The proportion of active and inactive HERVs varies considerably from group to group.

The Francis Crick Institute

Network architecture for training the “natural protein” auxiliary task.

Author: Jason Weston (74291)
Merja Oja (74249)
William Stafford Noble (25740)
Yanjun Qi (51055)

The “natural protein” auxiliary task aims to model the local patterns of amino acids that naturally occur in protein sequences. Using local windows in the unlabeled protein sequences as positive examples and randomly modified windows as negative examples, the network learns a feature representation for each amino acid. In contrast to the network illustrated in Figure 1 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0032235#pone-0032235-g001), this network contains only the amino acid embedding module in its first layer. The learned embedding is encoded in the real-valued parameter matrix of the amino acid feature extraction layer.
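The construction of positive and negative windows described in this caption can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the window size of 7, the choice to corrupt only the central residue, the function name `make_window_examples`, and the example sequence are all assumptions made for the sketch.

```python
import random

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard amino acids


def make_window_examples(sequence, window=7, rng=None):
    """Return (window_string, label) pairs: 1 = natural, 0 = randomly modified.

    Hypothetical helper: the caption only says negatives are "randomly
    modified windows"; the exact modification used here is an assumption.
    """
    rng = rng or random.Random(0)
    centre = window // 2
    examples = []
    for start in range(len(sequence) - window + 1):
        natural = sequence[start:start + window]
        examples.append((natural, 1))
        # Assumption: corrupt the window by swapping its central residue
        # for a different, randomly chosen amino acid.
        replacement = rng.choice([a for a in AMINO_ACIDS if a != natural[centre]])
        corrupted = natural[:centre] + replacement + natural[centre + 1:]
        examples.append((corrupted, 0))
    return examples


# Made-up sequence, used only to exercise the function.
examples = make_window_examples("MKTAYIAKQRQISFVKSHFSRQ", window=7)
```

Each natural window is paired with one corrupted copy, so a classifier trained on these pairs has to learn which local residue patterns look plausible.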

The Francis Crick Institute

Methods for estimating human endogenous retrovirus activities from EST databases-0

Author: Jaakko Peltonen (74250)
Jonas Blomberg (74251)
Merja Oja (74249)
Samuel Kaski (74252)

Copyright information: Taken from "Methods for estimating human endogenous retrovirus activities from EST databases", http://www.biomedcentral.com/1471-2105/8/S2/S11, BMC Bioinformatics 2007;8(Suppl 2):S11-S11. Published online 3 May 2007. PMCID: PMC1892069. [...]ture shown in the middle. The shaded box is the basic block of the sub-HMM and is repeated length-2 times. It is identical in all sub-HMMs; only the emission distribution of the match state varies between blocks. The emission is either the nucleotide in that position of the HERV sequence or a mismatch. The probabilities for match and mismatch are equal for all blocks. The EEMIT-state emits the low-quality end part

The Francis Crick Institute

Methods for estimating human endogenous retrovirus activities from EST databases-5

Author: Jaakko Peltonen (74250)
Jonas Blomberg (74251)
Merja Oja (74249)
Samuel Kaski (74252)

Copyright information: Taken from "Methods for estimating human endogenous retrovirus activities from EST databases", http://www.biomedcentral.com/1471-2105/8/S2/S11, BMC Bioinformatics 2007;8(Suppl 2):S11-S11. Published online 3 May 2007. PMCID: PMC1892069. [...] mixture model) and results from the complete set of 2450 HERVs (learned using the BLAST approach). The scale of the figure is such that the relative activities for the HERVs sum up to 1 in both x and y dimensions

The Francis Crick Institute

Comparison of learning strategies based on percent accuracy.

Author: Jason Weston (74291)
Merja Oja (74249)
William Stafford Noble (25740)
Yanjun Qi (51055)

The table lists, for each prediction task, the per-residue percent accuracy achieved via single-task training of the neural network with just the PSI-BLAST features (“Single”), single-task training that includes the amino acid embedding (“Embed”), multitask training just using the PSI-BLAST features (“Multi”), multitask training including the amino acid embedding (“Multi-Emb”), multitask training of one task along with the natural protein task (“NP”), multitask training without the PSI-BLAST embedding module but initializing the amino acid embedding by using the natural protein task (“NP only”), multitask training including the natural protein task (“All3”), “All3” with Viterbi post-processing (“All3+Vit”) and a previously reported method (“Previous”). Each row corresponds to a single task. The p-value column indicates whether the difference between “Single” and “All3+Vit” is significant, according to a Z-test. The “CV” column is computed from the accuracies of each cross-validation fold separately: it gives the percentage of CV folds in which the “All3+Vit” method outperforms the “Single” method. Rows labeled “(prot)” or “(seg)” report the protein- or segment-level accuracy, rather than residue-level accuracy. For the “NP” setting, the “*” in the “Embedding?” row indicates that this network uses the pre-trained embedding layer from the natural protein task.
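The two summary columns mentioned in this caption could be computed along the following lines. The caption does not say which form of Z-test was used, so the pooled two-proportion test below is an assumption, and the function names and any numbers passed to them are hypothetical rather than values from the table.

```python
import math


def two_proportion_z_test(correct_a, n_a, correct_b, n_b):
    """Two-sided pooled two-proportion Z-test (assumed form of the Z-test).

    Returns (z, p_value), where z > 0 means method B is more accurate.
    """
    p_a = correct_a / n_a
    p_b = correct_b / n_b
    pooled = (correct_a + correct_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1.0 - pooled) * (1.0 / n_a + 1.0 / n_b))
    z = (p_b - p_a) / std_err
    # Two-sided tail probability of a standard normal: 2 * (1 - Phi(|z|)).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return z, p_value


def percent_folds_won(fold_acc_a, fold_acc_b):
    """Percentage of cross-validation folds in which B beats A."""
    wins = sum(b > a for a, b in zip(fold_acc_a, fold_acc_b))
    return 100.0 * wins / len(fold_acc_a)
```

Here A would play the role of “Single” and B the role of “All3+Vit”; the per-fold comparison gives a view of consistency that a single pooled p-value does not.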

The Francis Crick Institute

A learned amino acid embedding.

Author: Jason Weston (74291)
Merja Oja (74249)
William Stafford Noble (25740)
Yanjun Qi (51055)

The figure shows an approximation of a 15-dimensional embedding of amino acids, learned by a neural network trained on the natural protein task. The projection to 2D is accomplished via principal component analysis.
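A projection of this kind can be sketched with a standard SVD-based PCA. The matrix below is random placeholder data with one 15-dimensional row per amino acid, not the learned embedding, and the function name `pca_project` is an assumption for the sketch.

```python
import numpy as np


def pca_project(embedding, n_components=2):
    """Project the rows of `embedding` onto their top principal components."""
    centred = embedding - embedding.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing
    # singular value (and therefore by decreasing explained variance).
    _, _, vt = np.linalg.svd(centred, full_matrices=False)
    return centred @ vt[:n_components].T


# Placeholder data: 20 amino acids x 15 embedding dimensions.
rng = np.random.default_rng(0)
placeholder_embedding = rng.normal(size=(20, 15))
coords = pca_project(placeholder_embedding)  # shape (20, 2)
```

Because only two of the fifteen dimensions survive, distances in the resulting scatter plot approximate, rather than reproduce, distances in the full embedding.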

The Francis Crick Institute

Multitask learning with weight sharing between multiple deep neural networks.

Author: Jason Weston (74291)
Merja Oja (74249)
William Stafford Noble (25740)
Yanjun Qi (51055)

In this figure, two related tasks are trained simultaneously using the network architecture from Figure 1 (http://www.plosone.org/article/info:doi/10.1371/journal.pone.0032235#pone-0032235-g001). Here only the very last layers of the network are task-specific.
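The idea of sharing all but the last layers between tasks can be illustrated with a small forward-pass sketch. The layer sizes, class counts, and the class names `SharedTrunk` and `TaskHead` are assumptions for illustration; the actual architecture is the one in Figure 1, and training is not shown.

```python
import numpy as np

rng = np.random.default_rng(0)


class SharedTrunk:
    """Hidden layer whose weights are shared by every task."""

    def __init__(self, n_in, n_hidden):
        self.W = rng.normal(scale=0.1, size=(n_in, n_hidden))
        self.b = np.zeros(n_hidden)

    def forward(self, x):
        return np.tanh(x @ self.W + self.b)


class TaskHead:
    """Task-specific output layer stacked on the shared trunk."""

    def __init__(self, trunk, n_hidden, n_classes):
        self.trunk = trunk  # a reference, not a copy: the weights are shared
        self.W = rng.normal(scale=0.1, size=(n_hidden, n_classes))
        self.b = np.zeros(n_classes)

    def forward(self, x):
        logits = self.trunk.forward(x) @ self.W + self.b
        logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
        exp = np.exp(logits)
        return exp / exp.sum(axis=1, keepdims=True)  # softmax class probabilities


# Hypothetical sizes: 30 input features, 16 hidden units, two tasks.
trunk = SharedTrunk(n_in=30, n_hidden=16)
head_a = TaskHead(trunk, n_hidden=16, n_classes=3)
head_b = TaskHead(trunk, n_hidden=16, n_classes=2)

x = rng.normal(size=(5, 30))
probs_a = head_a.forward(x)
probs_b = head_b.forward(x)
```

Since both heads hold the same trunk object, a gradient step taken for either task would update the representation used by the other, which is the mechanism multitask training relies on.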

The Francis Crick Institute