Search CORE

28,327 research outputs found

Learning Character Strings via Mastermind Queries, with a Case Study Involving mtDNA

Author: Goodrich Michael T.
Publication venue: 'Institute of Electrical and Electronics Engineers (IEEE)'
Publication date: 13/04/2010

We study the degree to which a character string, Q, leaks details about itself any time it engages in comparison protocols with strings provided by a querier, Bob, even if those protocols are cryptographically guaranteed to produce no additional information other than the scores that assess the degree to which Q matches strings offered by Bob. We show that such scenarios allow Bob to play variants of the game of Mastermind with Q so as to learn the complete identity of Q. We show that there are a number of efficient implementations for Bob to employ in these Mastermind attacks, depending on knowledge he has about the structure of Q, which show how quickly he can determine Q. Indeed, we show that Bob can discover Q using a number of rounds of test comparisons that is much smaller than the length of Q, under reasonable assumptions regarding the types of scores that are returned by the cryptographic protocols and whether he can use knowledge about the distribution that Q comes from. We also provide the results of a case study we performed on a database of mitochondrial DNA, showing the vulnerability of existing real-world DNA data to the Mastermind attack.

Comment: Full version of related paper appearing in IEEE Symposium on Security and Privacy 2009, "The Mastermind Attack on Genomic Data." This version corrects the proofs of what are now Theorems 2 and 4.

arXiv.org e-Print Archive

Crossref
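
The leakage described in the Mastermind-query record above can be illustrated with a deliberately naive baseline. The sketch below is not the paper's algorithm: it assumes a score that counts exactly matching positions, a known alphabet (here "ACGT") that contains every character of Q, and hypothetical helper names `make_oracle` and `recover`. It needs on the order of n times the alphabet size queries, far more than the sublinear round counts the abstract reports, and is meant only to show that match scores alone suffice to reconstruct Q.

```python
from typing import Callable, Tuple


def make_oracle(secret: str) -> Callable[[str], int]:
    """Build a scoring function that reveals only the number of matching positions."""
    def score(guess: str) -> int:
        if len(guess) != len(secret):
            raise ValueError("guess must have the same length as the secret")
        return sum(a == b for a, b in zip(secret, guess))
    return score


def recover(score: Callable[[str], int], length: int,
            alphabet: str = "ACGT") -> Tuple[str, int]:
    """Recover the hidden string one position at a time (naive baseline).

    Assumes every character of the hidden string is in `alphabet`.
    Returns the recovered string and the number of queries used.
    """
    base = alphabet[0]
    guess = [base] * length
    base_score = score("".join(guess))  # matches against the all-`base` string
    queries = 1
    recovered = []
    for i in range(length):
        found = base  # if no substitution raises the score, position i holds `base`
        for c in alphabet[1:]:
            trial = guess.copy()
            trial[i] = c
            queries += 1
            if score("".join(trial)) > base_score:
                found = c  # the score rose by one, so position i holds c
                break
        recovered.append(found)
    return "".join(recovered), queries


if __name__ == "__main__":
    oracle = make_oracle("GATTACA")
    print(recover(oracle, 7))
```

The querier never sees Q directly, only integer scores, yet the full string is recovered. The smarter strategies referred to in the abstract reduce the number of rounds well below this baseline.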

REPARATION : ribosome profiling assisted (re-)annotation of bacterial genomes

Author: Giess Adam
Jonckheere Veronique
Menschaert Gerben
Ndah Elvis
Valen Eivind
Van Damme Petra
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2017

Prokaryotic genome annotation is highly dependent on automated methods, as manual curation cannot keep up with the exponential growth of sequenced genomes. Current automated methods depend heavily on sequence composition and often underestimate the complexity of the proteome. We developed RibosomE Profiling Assisted (re-)AnnotaTION (REPARATION), a de novo machine learning algorithm that takes advantage of experimental protein synthesis evidence from ribosome profiling (Ribo-seq) to delineate translated open reading frames (ORFs) in bacteria, independent of genome annotation (https://github.com/Biobix/REPARATION). REPARATION evaluates all possible ORFs in the genome and estimates minimum thresholds based on a growth curve model to screen for spurious ORFs. We applied REPARATION to three annotated bacterial species to obtain a more comprehensive mapping of their translation landscape in support of experimental data. In all cases, we identified hundreds of novel (small) ORFs, including variants of previously annotated ORFs, and >70% of all (variants of) annotated protein coding ORFs were predicted by REPARATION to be translated. Our predictions are supported by matching mass spectrometry proteomics data, sequence composition and conservation analysis. REPARATION is unique in that it makes use of experimental translation evidence to intrinsically perform a de novo ORF delineation in bacterial genomes irrespective of the sequence features linked to open reading frames.

Ghent University Academic Bibliography

A computational framework for aesthetical navigation in musical search space

Author: Arshi Sahar
Davis Darryl

Paper presented at the 3rd AISB symposium on computational creativity, AISB 2016, 4-6th April, Sheffield. Abstract. This article addresses aspects of an ongoing project in the generation of artificial Persian (-like) music. Liquid Persian Music software (LPM) is a cellular automata based audio generator. In this paper LPM is discussed from the viewpoint of future potentials of algorithmic composition and creativity. Liquid Persian Music is a creative tool, enabling exploration of emergent audio through new dimensions of music composition. Various configurations of the system produce different voices which resemble musical motives in many respects. Aesthetical measurements are determined by Zipf’s law in an evolutionary environment. Arranging these voices together to produce a musical corpus can be considered as a search problem in the LPM output space of musical possibilities. On this account, the issues involved in defining the search space for LPM are studied throughout this paper.

Repository@Hull - Worktribe

In silico karyotyping of chromosomally polymorphic malaria mosquitoes in the Anopheles gambiae complex

Author: Besansky N. J.
Caputo B.
Della Torre A.
Love R. R.
Petrarca V.
Pombi M.
Redmond S. N.
The Anopheles gambiae 1000 Genomes Consortium
Publication venue: 'Genetics Society of America'
Publication date: 01/01/2019

Chromosomal inversion polymorphisms play an important role in adaptation to environmental heterogeneities. For mosquito species in the Anopheles gambiae complex that are significant vectors of human malaria, paracentric inversion polymorphisms are abundant and are associated with ecologically and epidemiologically important phenotypes. Improved understanding of these traits relies on determining mosquito karyotype, which currently depends upon laborious cytogenetic methods whose application is limited both by the requirement for specialized expertise and for properly preserved adult females at specific gonotrophic stages. To overcome this limitation, we developed sets of tag single nucleotide polymorphisms (SNPs) inside inversions whose biallelic genotype is strongly correlated with inversion genotype. We leveraged 1,347 fully sequenced An. gambiae and Anopheles coluzzii genomes in the Ag1000G database of natural variation. Beginning with principal components analysis (PCA) of population samples, applied to windows of the genome containing individual chromosomal rearrangements, we classified samples into three inversion genotypes, distinguishing homozygous inverted and homozygous uninverted groups by inclusion of the small subset of specimens in Ag1000G that are associated with cytogenetic metadata. We then assessed the correlation between candidate tag SNP genotypes and PCA-based inversion genotypes in our training sets, selecting those candidates with >80% agreement. Our initial tests both in held-back validation samples from Ag1000G and in data independent of Ag1000G suggest that when used for in silico inversion genotyping of sequenced mosquitoes, these tags perform better than traditional cytogenetics, even for specimens where only a small subset of the tag SNPs can be successfully ascertained

Archivio della ricerca- Università di Roma La Sapienza
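
The tag-SNP selection step in the karyotyping record above (keeping candidates with >80% agreement with PCA-based inversion genotypes) can be sketched as a simple concordance filter. This is an illustrative assumption rather than the authors' pipeline: genotypes are coded 0/1/2, missing calls are `None` and are skipped, and the function names `concordance` and `select_tags` are hypothetical.

```python
from typing import Dict, List, Optional, Sequence


def concordance(snp_genotypes: Sequence[Optional[int]],
                inversion_genotypes: Sequence[int]) -> float:
    """Fraction of samples with a called SNP genotype that equals the inversion genotype."""
    pairs = [(s, k) for s, k in zip(snp_genotypes, inversion_genotypes)
             if s is not None]  # skip samples where the SNP could not be called
    if not pairs:
        return 0.0
    return sum(s == k for s, k in pairs) / len(pairs)


def select_tags(candidates: Dict[str, Sequence[Optional[int]]],
                inversion_genotypes: Sequence[int],
                threshold: float = 0.8) -> List[str]:
    """Keep candidate SNPs whose agreement strictly exceeds the threshold."""
    return [name for name, genotypes in candidates.items()
            if concordance(genotypes, inversion_genotypes) > threshold]


if __name__ == "__main__":
    inversion = [0, 0, 1, 1, 2, 2, 0, 1, 2, 0]
    candidates = {
        "snp_a": [0, 0, 1, 1, 2, 2, 0, 1, 2, 0],
        "snp_b": [0, 0, 1, 1, 2, 2, 1, 0, 1, 0],
    }
    print(select_tags(candidates, inversion))
```

The threshold default of 0.8 mirrors the ">80% agreement" criterion stated in the abstract. How the authors handle missing calls or allele orientation is not stated there, so those details are assumptions here.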

Joint assembly and genetic mapping of the Atlantic horseshoe crab genome reveals ancient whole genome duplication

Author: Brockmann H Jane
Havlak Paul
Lv Jie
Nossa Carlos
Putnam Nicholas H
Vincent Kim
Yue Jia-Xing
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 27/09/2013

Horseshoe crabs are marine arthropods with a fossil record extending back approximately 450 million years. They exhibit remarkable morphological stability over their long evolutionary history, retaining a number of ancestral arthropod traits, and are often cited as examples of "living fossils." As arthropods, they belong to the Ecdysozoa, an ancient super-phylum whose sequenced genomes (including insects and nematodes) have thus far shown more divergence from the ancestral pattern of eumetazoan genome organization than cnidarians, deuterostomes, and lophotrochozoans. However, much of ecdysozoan diversity remains unrepresented in comparative genomic analyses. Here we use a new strategy of combined de novo assembly and genetic mapping to examine the chromosome-scale genome organization of the Atlantic horseshoe crab Limulus polyphemus. We constructed a genetic linkage map of this 2.7 Gbp genome by sequencing the nuclear DNA of 34 wild-collected, full-sibling embryos and their parents at a mean redundancy of 1.1x per sample. The map includes 84,307 sequence markers and 5,775 candidate conserved protein coding genes. Comparison to other metazoan genomes shows that the L. polyphemus genome preserves ancestral bilaterian linkage groups, and that a common ancestor of modern horseshoe crabs underwent one or more ancient whole genome duplications (WGDs) ~300 MYA, followed by extensive chromosome fusion.

arXiv.org e-Print Archive

Springer - Publisher Connector

PubMed Central

DSpace at Rice University

Balancing Selection at the Tomato RCR3 Guardee Gene Family Maintains Variation in Strength of Pathogen Defense

Author: A Tellier
AL Caicedo
Anja C. Hörger
Aurélien Tellier
BBH Wulff
C Zipfel
D Greenbaum
D Tian
E Baudry
EA Kerr
EA Stahl
EA van der Biezen
EB Holub
EG Bakker
F Kaschani
F Tajima
H Innan
HCE Rooney
HH Flor
HP van Esse
I Ispolatov
IE Peralta
J Bergelson
J Krüger
J Song
JA Labate
JDG Jones
JKM Brown
JL Dangl
JN Thompson
JV Chamary
K Bomblies
K Roselius
KE Hammond-Kosack
KR Thornton
KR Young
KS Caldwell
L Excoffier
Laura E. Rose
LE Rose
M Kruijt
M Shabab
MA Beaumont
MS Dixon
MT Nishimura
Muhammad Ilyas
MY Tian
P Librado
P Schmid-Hempel
PD Spanu
PN Dodds
R Dawkins
RAL van der Hoorn
Renier A. L. van der Hoorn
Rodney Mauricio
RW Michelmore
S Raffaele
ST Chisholm
T Nakazato
T Nurnberger
T Städler
TET Bond
U Arunyawat
Wolfgang Stephan
Y Jia
YG Liu
Publication venue: 'Public Library of Science (PLoS)'
Publication date: 01/01/2012

Coevolution between hosts and pathogens is thought to occur between interacting molecules of both species. This results in the maintenance of genetic diversity at pathogen antigens (or so-called effectors) and host resistance genes such as the major histocompatibility complex (MHC) in mammals or resistance (R) genes in plants. In plant-pathogen interactions, the current paradigm posits that a specific defense response is activated upon recognition of pathogen effectors via interaction with their corresponding R proteins. According to the "Guard-Hypothesis," R proteins (the "guards") can sense modification of target molecules in the host (the "guardees") by pathogen effectors and subsequently trigger the defense response. Multiple studies have reported high genetic diversity at R genes maintained by balancing selection. In contrast, little is known about the evolutionary mechanisms shaping the guardee, which may be subject to contrasting evolutionary forces. Here we show that the evolution of the guardee RCR3 is characterized by gene duplication, frequent gene conversion, and balancing selection in the wild tomato species Solanum peruvianum. Investigating the functional characteristics of 54 natural variants through in vitro and in planta assays, we detected differences in recognition of the pathogen effector through interaction with the guardee, as well as substantial variation in the strength of the defense response. This variation is maintained by balancing selection at each copy of the RCR3 gene. Our analyses pinpoint three amino acid polymorphisms with key functional consequences for the coevolution between the guardee (RCR3) and its guard (Cf-2). We conclude that, in addition to coevolution at the "guardee-effector" interface for pathogen recognition, natural selection acts on the "guard-guardee" interface. Guardee evolution may be governed by a counterbalance between improved activation in the presence and prevention of auto-immune responses in the absence of the corresponding pathogen.

CiteSeerX

Crossref

Directory of Open Access Journals

Open Access LMU

PubMed Central

Oxford University Research Archive

MPG.PuRe

FigShare

ISOWN: accurate somatic mutation identification in the absence of normal tissue controls.

Author: Bartlett John MS
Kalatskaya Irina
McPherson John D
Spears Melanie
Stein Lincoln
Trinh Quang M
Publication venue: eScholarship, University of California
Publication date: 01/06/2017

Background: A key step in cancer genome analysis is the identification of somatic mutations in the tumor. This is typically done by comparing the genome of the tumor to the reference genome sequence derived from a normal tissue taken from the same donor. However, there are a variety of common scenarios in which matched normal tissue is not available for comparison. Results: In this work, we describe an algorithm to distinguish somatic single nucleotide variants (SNVs) in next-generation sequencing data from germline polymorphisms in the absence of normal samples using a machine learning approach. Our algorithm was evaluated using a family of supervised learning classifications across six different cancer types and ~1600 samples, including cell lines, fresh frozen tissues, and formalin-fixed paraffin-embedded tissues; we tested our algorithm with both deep targeted and whole-exome sequencing data. Our algorithm correctly classified between 95 and 98% of somatic mutations, with F1-measures ranging from 75.9 to 98.6% depending on the tumor type. We have released the algorithm as a software package called ISOWN (Identification of SOmatic mutations Without matching Normal tissues). Conclusions: In this work, we describe the development, implementation, and validation of ISOWN, an accurate algorithm for predicting somatic mutations in cancer tissues in the absence of matching normal tissues. ISOWN is available as Open Source under Apache License 2.0 from https://github.com/ikalatskaya/ISOWN

University of Toronto Research Repository

Crossref

Directory of Open Access Journals

eScholarship - University of California

BRUCE: a program for the detection of transfer-messenger RNA genes in nucleotide sequences

Author: Andersson S.
Canback B.
Laslett D.
Publication venue: 'Oxford University Press (OUP)'
Publication date: 01/01/2002

A computer program, BRUCE, was developed for the identification of transfer‐messenger RNA (tmRNA) genes. The program employs heuristic algorithms to search for a tRNA(Ala)‐like secondary structure surrounding a short sequence encoding the tag peptide. In the 57 completely sequenced bacterial genomes where tmRNA genes have been reported previously, BRUCE identified all with no false positives. In addition, BRUCE found 99 of the 100 tmRNAs identified previously in other bacteria, red chloroplasts and cyanelles. The output of the program reports the proposed tRNA secondary structure, the tmRNA gene sequence and the tag peptide.

PubMed Central

Research Repository