
253 research outputs found

Approximating Clustering of Fingerprint Vectors with Missing Values

Author: A. Figueroa
C.H. Papadimitriou
G. Ausiello
Giancarlo Mauri
Gianluca Della Vedova
L. Valinsky
L. Valinsky
M. Chlebík
P. Alimonti
Paola Bonizzoni
R. Drmanac
Riccardo Dondi
S. Drmanac
S. Drmanac
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 23/11/2005

The problem of clustering fingerprint vectors is an interesting problem in computational biology that was proposed in (Figueroa et al. 2004). In this paper we narrow the gaps between the known lower and upper bounds on the approximability of some variants of the biological problem. Namely, we prove that the problem is APX-hard even when each fingerprint contains only two unknown positions. Moreover, we study some variants of the original problem and give two 2-approximation algorithms for the IECMV and OECMV problems when the number of unknown entries in each vector is at most a constant.

Comment: 13 pages, 4 figures

arXiv.org e-Print Archive

Crossref

Recommended from our members

Coupling sequencing by hybridization (SBH) with gel sequencing for an inexpensive analysis of genes and genomes

Author: Drmanac R.
Drmanac S.
Hauser B.
Labat I.
Publication venue: 'Office of Scientific and Technical Information (OSTI)'
Publication date: 01/11/1996

The speed and cost of DNA sequencing are bottlenecks in the analysis of genes and genomes. Sequencing by hybridization (SBH) is a versatile method with several applications that can accelerate DNA screening, mapping and sequencing. Requirements, achievements and problems in the development of SBH format 1 (DNA samples arrayed) are presented, and schemes for its synergetic coupling with gel sequencing techniques are discussed. It appears that with one hybridization machine with 24 boxes and four ABI gel sequencers, 100-300 Mb of DNA sequence can be determined per year. Various genetic studies based on computer-assisted analysis of large collections of partial or complete DNA sequences ("sequenetics") may be achieved in this century.

UNT Digital Library

Identification of FVIII gene mutations in patients with hemophilia A using new combinatorial sequencing by hybridization

Author: Chetta M
Drmanac A
Fortina P
Grandone E
Margaglione M
Santacroce R
Surrey S
Publication venue: Medknow Publications on behalf of Indian Society of Human Genetics
Publication date: 09/03/2009

Background: Standard methods of mutation detection in Hemophilia A (HA) are time consuming, which makes them unavailable for some analyses such as prenatal diagnosis. Objectives: To evaluate the feasibility of combinatorial sequencing-by-hybridization (cSBH) as an alternative and reliable tool for mutation detection in the FVIII gene. Patients/Methods: We applied a new method of cSBH that uses two different colors for detection of multiple point mutations in the FVIII gene. The 26 exons encompassing the HA gene were analyzed in 7 newly diagnosed Italian patients and in 19 previously characterized individuals with FVIII deficiency. Results: Data show that, when solution-phase TAMRA- and QUASAR-labeled 5-mer oligonucleotide sets mixed with unlabeled target PCR templates are co-hybridized in the presence of DNA ligase to universal 6-mer oligonucleotide probe-based arrays, a number of mutations can be successfully detected. The technique was also reliable in identifying a mutant FVIII allele in an obligate heterozygote. A novel missense mutation (Leu1843Thr) in exon 16 and three novel neutral polymorphisms are presented with an updated protocol for 2-color cSBH. Conclusions: cSBH is a reliable tool for mutation detection in the FVIII gene and may represent a complementary method for the genetic screening of HA patients.

Bioline International

Routes for breaching and protecting genetic privacy

Author: A Acquisti
A Cavoukian
A Kong
A Machanavajjhala
A Narayanan
AD Johnson
AJ Pakstis
AK Manning
AL McGuire
Arvind Narayanan
B Fons
B Malin
B Malin
BA Malin
BM Henn
C Dwork
C Shannon
CD Huff
D Clayton
D He
D Zubakov
DJ Solve
DR Nyholt
DW Craig
EA Zerhouni
EE Schadt
EM Ramos
F Liu
G Church
H Lango Allen
H Li
HK Im
HS Venter
J Burn
J Gitschier
J Kaiser
J Kaye
J Kaye
J Lee
J Marchini
JE Lunshof
JH Park
JM Oliver
JP Roberts
K Benitez
K El Emam
K El Emam
K Silventoinen
KA Tryka
KB Jacobs
KS Kendler
L Kamm
L Sweeney
L Sweeney
LA Sweeney
LA Sweeney
LAP Kohn
LL Rodriguez
M Canim
M Gymrek
M Gymrek
M Kantarcioglu
M Kayser
MD Mailman
N Chatterjee
N Homer
NN Taleb
P Bohannon
P Kwok
P Ohm
P Paillier
PM Visscher
R Braun
R Drmanac
R Khan
R Noumeir
RL Bennett
S Byers
S McClure
S Sankararaman
S Walsh
SE Brenner
SF Terry
SH Friend
T Lumley
TE King
TE King
V Bafna
W Fu
W Hartzog
WG Hill
WW Lowrance
XL Ou
Yaniv Erlich
Z Lin
Publication date: 01/12/2013

We are entering the era of ubiquitous genetic information for research, clinical care, and personal curiosity. Sharing these datasets is vital for rapid progress in understanding the genetic basis of human diseases. However, one growing concern is the ability to protect the genetic privacy of the data originators. Here, we technically map threats to genetic privacy and discuss potential mitigation strategies for privacy-preserving dissemination of genetic data.

Comment: Draft for comment

arXiv.org e-Print Archive

Princeton University Open Access Repository

Crossref

PubMed Central

A population-specific reference panel empowers genetic studies of Anabaptist populations

Author: A Kong
A Manichaikul
A Tan
B Georgi
BL Browning
C Genomes Project
C Sidore
CV Hout Van
EG Puffenberger
ET Lim
G Glusman
G Pistis
J Marchini
J O’Connell
K Hatzikotoulas
K Kristiansson
K Wang
KA Strauss
L Hou
L Peltonen
L Peltonen
M Arcos-Burgos
M Kircher
M Lek
MH Crawford
MJ Khoury
MP Conomos
MS McPeek
O Delaneau
P Carnevali
Q Duan
R Drmanac
R Durbin
S Carmi
S McCarthy
VA McKusick
Publication venue: 'Springer Science and Business Media LLC'
Publication date: 20/07/2017

Genotype imputation is a powerful strategy for achieving the large sample sizes required for identification of variants underlying complex phenotypes, but imputation of rare variants remains problematic. Genetically isolated populations offer one solution; however, population-specific reference panels are needed to assure optimal imputation accuracy and allele frequency estimation. Here we report the Anabaptist Genome Reference Panel (AGRP), the first whole-genome catalogue of variants and phased haplotypes in people of Amish and Mennonite ancestry. Based on high-depth whole-genome sequence (WGS) from 265 individuals, the AGRP contains >12 M high-confidence single nucleotide variants and short indels, of which ~12.5% are novel. These Anabaptist-specific variants were more deleterious than variants with comparable frequencies observed in the 1000 Genomes panel. About 43,000 variants showed enriched allele frequencies in AGRP, consistent with drift. When combined with the 1000 Genomes Project reference panel, the AGRP substantially improved imputation, especially for rarer variants. The AGRP is freely available to researchers through an imputation server.

Crossref

KU ScholarWorks

University of Miami: Scholarship Miami

Providence St. Joseph Health Digital Commons

Detection and phasing of single base de novo mutations in biopsies from human in vitro fertilized embryos by advanced whole-genome sequencing

Author: Agarwal M.
Alferov O.
Berkeley A.
Crain B.
Drmanac R.
Gulbahce N.
Hayden D.
Kermani B.
McElwain M.
Munné S.
Peters B.
Prates R.
Tang Y.
Tearle R.
Zhang R.
Publication venue: 'Cold Spring Harbor Laboratory'
Publication date: 01/01/2015

Currently, the methods available for preimplantation genetic diagnosis (PGD) of in vitro fertilized (IVF) embryos do not detect de novo single-nucleotide and short indel mutations, which have been shown to cause a large fraction of genetic diseases. Detection of all these types of mutations requires whole-genome sequencing (WGS). In this study, advanced massively parallel WGS was performed on three 5- to 10-cell biopsies from two blastocyst-stage embryos. Both parents and paternal grandparents were also analyzed to allow for accurate measurements of false-positive and false-negative error rates. Overall, >95% of each genome was called. In the embryos, experimentally derived haplotypes and barcoded read data were used to detect and phase up to 82% of de novo single base mutations with a false-positive rate of about one error per Gb, resulting in fewer than 10 such errors per embryo. This represents a ~100-fold lower error rate than previously published from 10 cells, and it is the first demonstration that advanced WGS can be used to accurately identify these de novo mutations in spite of the thousands of false-positive errors introduced by the extensive DNA amplification required for deep sequencing. Using haplotype information, we also demonstrate how small de novo deletions could be detected. These results suggest that phased WGS using barcoded DNA could be used in the future as part of the PGD process to maximize comprehensiveness in detecting disease-causing mutations and to reduce the incidence of genetic diseases.

Brock A. Peters, Bahram G. Kermani, Oleg Alferov, Misha R. Agarwal, Mark A. McElwain, Natali Gulbahce, Daniel M. Hayden, Y. Tom Tang, Rebecca Yu Zhang, Rick Tearle, Birgit Crain, Renata Prates, Alan Berkeley, Santiago Munné and Radoje Drmanac

Crossref

Adelaide Research & Scholarship

PubMed Central

Detecting Past Positive Selection through Ongoing Negative Selection

Author: Ahn
Alexey S. Kondrashov
Andrews
Asthana
Baxevanis
Bazykin
Bazykin
Benson
Bentley
Charlesworth
Charlesworth
Drmanac
Eyre-Walker
Eyre-Walker
Eyre-Walker
Georgii A. Bazykin
Grossman
Heger
Hodgkinson
Hsu
Huelsenbeck
Jordan
Keightley
Kimura
Krause
Kryazhimskiy
Kryazhimskiy
Kuhn
Levy
McDonald
Mustonen
Nielsen
Novembre
Popadin
Ruff
Schuster
Smith
Taylor
Thompson
Tweedie
Wang
Watterson
Yang
Yang
Publication venue: Oxford University Press

Detecting positive selection is a challenging task. We propose a method for detecting past positive selection through ongoing negative selection, based on comparison of the parameters of intraspecies polymorphism at functionally important and selectively neutral sites where a nucleotide substitution of the same kind occurred recently. Reduced occurrence of recently replaced ancestral alleles at functionally important sites indicates that negative selection currently acts against these alleles and, therefore, that their replacements were driven by positive selection. Application of this method to the Drosophila melanogaster lineage shows that the fraction of adaptive amino acid replacements remained approximately 0.5 for a long time. In the Homo sapiens lineage, however, this fraction drops from approximately 0.5 before the Ponginae–Homininae divergence to approximately 0 after it. The proposed method is based on essentially the same data as the McDonald–Kreitman test but is free from some of its limitations, which may open new opportunities, especially when many genotypes within a species are known

Crossref

PubMed Central

cPAS-based sequencing on the BGISEQ-500 to explore small non-coding RNAs

Author: A Keller
A Keller
A Keller
B Canard
B Langmead
B Meder
C Backes
C Backes
C Backes
D Stockel
D Veneziano
EL Dijk van
G Ruvkun
KL Burgos
M Hafner
M Hamberg
MR Friedlander
MR Friedlander
P Leidinger
P Mestdagh
R Drmanac
R Drmanc
RC Lee
S Griffiths-Jones
X Huang
Y Zhao
Publication venue: 'Springer Science and Business Media LLC'

Crossref

Manifold Learning for Human Population Structure Studies

Author: A Chakravarti
AB Lee
AB Lee
AJ Izenman
AL Price
B Li
BM Henn
C Deng
DR Bentley
E Kosman
F Collins
GV Kryukov
Hoicheong Siu
J Friedman
J Novembre
J Shendure
J Tenenbaum
J Zhang
J Zhang
J Zhang
JC Venter
JE Pool
L Cavalli-Sforza
Li Jin
LK Saul
M Belkin
ML Metzker
Momiao Xiong
N Kambhatla
N Patterson
P Menozzi
P Paschou
PE Smouse
R Drmanac
R Nielsen
R Wang
S Biswas
S Roweis
S Yan
SY Kim
T Tibshirani
W Guan
W Zhang
Yun Li
Publication venue: Public Library of Science
Publication date: 01/01/2012

The dimension of the population genetics data produced by next-generation sequencing platforms is extremely high. However, the “intrinsic dimensionality” of sequence data, which determines the structure of populations, is much lower. This motivates us to use locally linear embedding (LLE), which projects high-dimensional genomic data into a low-dimensional, neighborhood-preserving embedding, as a general framework for population structure and historical inference. To facilitate application of the LLE to population genetic analysis, we systematically investigate several important properties of the LLE and reveal the connection between the LLE and principal component analysis (PCA). Identifying a set of markers and genomic regions that could be used for population structure analysis will provide invaluable information for population genetics and association studies. In addition to identifying the LLE-correlated or PCA-correlated structure-informative markers, we have developed a new statistic that integrates genomic information content in a genomic region for collectively studying its association with the population structure, and a LASSO algorithm to search for such regions across the genomes. We applied the developed methodologies to a low-coverage pilot dataset in the 1000 Genomes Project and a PHASE III Mexico dataset of the HapMap. We observed that 25.1%, 44.9% and 21.4% of the common variants and 89.2%, 92.4% and 75.1% of the rare variants were LLE-correlated markers in CEU, YRI and ASI, respectively. This showed that rare variants, which are often private to specific populations, have much higher power to identify population substructure than common variants. The preliminary results demonstrated that next-generation sequencing offers rich resources and that LLE provides a powerful tool for population structure analysis.

CiteSeerX

Public Library of Science (PLOS)

Crossref

Directory of Open Access Journals

PubMed Central

FigShare

Targeted resequencing of candidate genes using selector probes

Author: Albert
Bentley
Burbano
Dahl
Dahl
Dean
Drmanac
E. Falk Sörqvist
F. Roos
Frazer
Gnirke
H. Göransson Kultima
H. Johansson
Han
Hodges
J. Botling
J. Stenberg
K. Edlund
Landegren
M. Isaksson
Mamanova
Margulies
Mats Nilsson
Meuzelaar
Olle Ericsson
Olshen
P. Micke
Porreca
S. Fredriksson
Saiki
Sanger
Sjöblom
Smith
Stenberg
Stenberg
Stenberg
Summerer
T. Sjöblom
Tewhey
Turner
Turner
Varley
Weinstein
Publication venue: Oxford University Press

Targeted genome enrichment is a powerful tool for making use of the massive throughput of novel DNA-sequencing instruments. We herein present a simple and scalable protocol for multiplex amplification of target regions based on the Selector technique. The updated version exhibits improved coverage and compatibility with next-generation-sequencing (NGS) library-construction procedures for shotgun sequencing with NGS platforms. To demonstrate the performance of the technique, all 501 exons from 28 genes frequently involved in cancer were enriched for and sequenced in specimens derived from cell lines and tumor biopsies. DNA from both fresh frozen and formalin-fixed paraffin-embedded biopsies was analyzed, and 94% specificity and 98% coverage of the targeted region were achieved. Reproducibility between replicates was high (R² = 0.98) and readily enabled detection of copy-number variations. The procedure can be carried out in <24 h and does not require any dedicated instrumentation.

Crossref

PubMed Central