Handling the data management needs of high-throughput sequencing data: SpeedGene, a compression algorithm for the efficient storage of genetic data
Background: As Next-Generation Sequencing data become available, existing hardware environments do not provide sufficient storage space and computational power to store and process the data because of their enormous size. This is, and will remain, a problem that researchers working on genetic data encounter every day. There are some options available for compressing and storing such data, such as general-purpose compression software and the PBAT/PLINK binary format. However, these currently available methods either do not offer sufficient compression rates or require a great amount of CPU time for decompression and loading every time the data are accessed. Results: Here, we propose a novel and simple algorithm for storing such sequencing data. We show that the compression factor of the algorithm ranges from 16 to several hundred, which potentially allows SNP data of hundreds of gigabytes to be stored in hundreds of megabytes. We provide a C++ implementation of the algorithm, which supports direct loading and parallel loading of the compressed format without requiring extra time for decompression. By applying the algorithm to simulated and real datasets, we show that the algorithm gives a greater compression rate than the commonly used compression methods and that the data-loading process takes less time. Also, the C++ library provides direct data-retrieval functions, which allow the compressed information to be easily accessed by other C++ programs. Conclusions: The SpeedGene algorithm enables the storage and analysis of next-generation sequencing data in current hardware environments, making system upgrades unnecessary.
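To make the idea of compact genotype storage concrete, here is a minimal sketch of generic 2-bit genotype packing, in the spirit of the binary genotype formats the abstract mentions. It is not the SpeedGene algorithm, which the abstract does not describe, and it does not reproduce the bit codes of any real file format. The function names and the coding (0, 1, 2 copies of an allele, 3 for missing) are assumptions made for illustration.

```python
# Hypothetical helpers: generic 2-bit packing of biallelic SNP genotypes.
# Coding assumed here: 0, 1, 2 = allele count, 3 = missing genotype.

def pack_genotypes(genotypes):
    """Pack a list of genotype codes (0..3) into bytes, four codes per byte."""
    packed = bytearray((len(genotypes) + 3) // 4)
    for i, g in enumerate(genotypes):
        if g not in (0, 1, 2, 3):
            raise ValueError(f"invalid genotype code: {g!r}")
        # Each code occupies 2 bits; position within the byte is i % 4.
        packed[i // 4] |= g << (2 * (i % 4))
    return bytes(packed)


def unpack_genotypes(packed, count):
    """Recover the first `count` genotype codes from packed bytes."""
    return [(packed[i // 4] >> (2 * (i % 4))) & 0b11 for i in range(count)]


if __name__ == "__main__":
    codes = [0, 1, 2, 3, 2, 1]
    blob = pack_genotypes(codes)
    print(len(codes), "genotypes stored in", len(blob), "bytes")
    print(unpack_genotypes(blob, len(codes)))
```

Because each genotype can be read with a shift and a mask, a layout like this can be queried directly without a separate decompression pass, which is the kind of property the abstract emphasizes.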
A general semi-parametric approach to the analysis of genetic association studies in population-based designs
Background: For genetic association studies in designs of unrelated individuals, current statistical methodology typically models the phenotype of interest as a function of the genotype and assumes a known statistical model for the phenotype. In the analysis of complex phenotypes, especially in the presence of ascertainment conditions, the specification of such model assumptions is not straightforward and is error-prone, potentially causing misleading results. Results: In this paper, we propose an alternative approach that treats the genotype as the random variable and conditions upon the phenotype. Thereby, the validity of the approach does not depend on the correctness of assumptions about the phenotypic model. Misspecification of the phenotypic model may lead to reduced statistical power. Theoretical derivations and simulation studies demonstrate both the validity and the advantages of the approach over existing methodology. In the COPDGene study (a GWAS for Chronic Obstructive Pulmonary Disease (COPD)), we apply the approach to a secondary, quantitative phenotype, the Fagerstrom nicotine dependence score, that is correlated with COPD affection status. The software package that implements this method is available. Conclusions: The flexibility of this approach enables the straightforward application to quantitative phenotypes and binary traits in ascertained and unascertained samples. In addition to its robustness features, our method provides the platform for the construction of complex statistical models for longitudinal data, multivariate data, multi-marker tests, rare-variant analysis, and others.
BICEP: a large angular scale CMB polarimeter
We describe the design and expected performance of BICEP, a millimeter wave receiver designed to measure the polarization of the cosmic microwave background. BICEP uses an array of polarization sensitive bolometers operating at 100 and 150 GHz to measure polarized signals over a 20 degree field of view with 1 degree resolution. BICEP is designed with particular attention to systematic effects which can potentially degrade the polarimetric fidelity of the observations. BICEP is optimized to detect the faint signature of a primordial gravitational wave background which is a generic prediction of inflationary cosmologies
Interpreting ciliopathy-associated missense variants of uncertain significance (VUS) in Caenorhabditis elegans
Better methods are required to interpret the pathogenicity of disease-associated variants of uncertain significance (VUS), which cannot be actioned clinically. In this study, we explore the use of an animal model (Caenorhabditis elegans) for in vivo interpretation of missense VUS alleles of TMEM67, a cilia gene associated with ciliopathies. CRISPR/Cas9 gene editing was used to generate homozygous knock-in C. elegans worm strains carrying TMEM67 patient variants engineered into the orthologous gene (mks-3). Quantitative phenotypic assays of sensory cilia structure and function (neuronal dye filling, roaming and chemotaxis assays) measured how the variants impacted mks-3 gene function. Effects of the variants on mks-3 function were further investigated by looking at MKS-3::GFP localization and cilia ultrastructure. The quantitative assays in C. elegans accurately distinguished between known benign (Asp359Glu, Thr360Ala) and known pathogenic (Glu361Ter, Gln376Pro) variants. Analysis of eight missense VUS generated evidence that three are benign (Cys173Arg, Thr176Ile and Gly979Arg) and five are pathogenic (Cys170Tyr, His782Arg, Gly786Glu, His790Arg and Ser961Tyr). Results from worms were validated by a genetic complementation assay in a human TMEM67 knock-out hTERT-RPE1 cell line that tests a TMEM67 signalling function. We conclude that efficient genome editing and quantitative functional assays in C. elegans make it a tractable in vivo animal model for rapid, cost-effective interpretation of ciliopathy-associated missense VUS alleles
CMB polarimetry with BICEP: instrument characterization, calibration, and performance
BICEP is a ground-based millimeter-wave bolometric array designed to target the primordial gravity wave signature on the polarization of the cosmic microwave background (CMB) at degree angular scales. Currently in its third year of operation at the South Pole, BICEP is measuring the CMB polarization with unprecedented sensitivity at 100 and 150 GHz in the cleanest available 2% of the sky, as well as deriving independent constraints on the diffuse polarized foregrounds with select observations on and off the Galactic plane. Instrument calibrations are discussed in the context of rigorous control of systematic errors, and the performance during the first two years of the experiment is reviewed.
Comment: 12 pages, 15 figures, updated version of a paper accepted for Millimeter and Submillimeter Detectors and Instrumentation for Astronomy IV, Proceedings of SPIE, 7020, 200
Absolute polarization angle calibration using polarized diffuse Galactic emission observed by BICEP
We present a method of cross-calibrating the polarization angle of a polarimeter using BICEP Galactic observations. BICEP was a ground-based experiment using an array of 49 pairs of polarization-sensitive bolometers observing from the geographic South Pole at 100 and 150 GHz. The BICEP polarimeter is calibrated to ±0.01 in cross-polarization and to better than ±0.7 degrees in absolute polarization orientation. BICEP observed the temperature and polarization of the Galactic plane (R.A. = 100 to 270 degrees and Dec. = -67 to -48 degrees). We show that the statistical error in the 100 GHz BICEP Galaxy map can constrain the polarization angle offset of the WMAP W band to 0.6 ± 1.4 degrees. The expected 1-sigma errors on the polarization angle cross-calibration for Planck or EPIC are 1.3 degrees and 0.3 degrees at 100 and 150 GHz, respectively. We also discuss the expected improvement of the BICEP Galactic field observations with forthcoming BICEP2 and Keck observations.
Comment: 13 pages, 10 figures and 2 tables. To appear in Proceedings of SPIE Astronomical Telescopes and Instrumentation 201
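As a rough illustration of what a polarization angle offset means, the sketch below rotates linear Stokes parameters (Q, U) by an angle and recovers that angle by comparing a reference measurement with a rotated one. It assumes the standard relation in which a rotation of the polarization angle by an angle a rotates (Q, U) by 2a, with one particular sign convention. The function names are hypothetical, and this noise-free toy is not the map-based estimator used in the paper.

```python
import math

# Hypothetical helpers; sign convention for the rotation is an assumption.

def rotate_stokes(q, u, offset_deg):
    """Rotate linear Stokes parameters by a polarization angle offset (degrees).

    A polarization angle change of `offset_deg` rotates (Q, U) by twice that angle.
    """
    two_a = 2.0 * math.radians(offset_deg)
    c, s = math.cos(two_a), math.sin(two_a)
    return q * c - u * s, q * s + u * c


def angle_offset_deg(q_ref, u_ref, q_obs, u_obs):
    """Estimate the angle offset between a reference and an observed (Q, U) pair."""
    cross = q_ref * u_obs - u_ref * q_obs   # proportional to sin(2a)
    dot = q_ref * q_obs + u_ref * u_obs     # proportional to cos(2a)
    return math.degrees(0.5 * math.atan2(cross, dot))


if __name__ == "__main__":
    q_ref, u_ref = 1.0, 0.5
    q_obs, u_obs = rotate_stokes(q_ref, u_ref, 0.6)
    print(angle_offset_deg(q_ref, u_ref, q_obs, u_obs))
```

In a real cross-calibration the comparison would be made over many noisy map pixels, so the recovered offset carries a statistical error such as the ones quoted in the abstract.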
Simultaneous measurements of aerosol size distributions at three sites in the European high Arctic
19 pages, 9 figures, 1 table
Aerosols are an integral part of the Arctic climate system due to their direct interaction with radiation and indirect interaction through cloud formation. Understanding aerosol size distributions and their dynamics is crucial for the ability to predict these climate-relevant effects. When of favourable size and composition, both long-range-transported and locally formed particles may serve as cloud condensation nuclei (CCN). Small changes of composition or size may have a large impact on the low CCN concentrations currently characteristic of the Arctic environment. We present a cluster analysis of particle size distributions (PSDs; size range 8-500 nm) simultaneously collected from three high Arctic sites during a 3-year period (2013-2015). Two sites are located in the Svalbard archipelago: Zeppelin research station (ZEP; 474 m above ground) and the nearby Gruvebadet Observatory (GRU; about 2 km distance from Zeppelin, 67 m above ground). The third site (Villum Research Station at Station Nord, VRS; 30 m above ground) is 600 km west-northwest of Zeppelin, at the tip of northeastern Greenland. The GRU site is included in an inter-site comparison for the first time. K-means cluster analysis provided eight specific aerosol categories, further combined into broad PSD classes with similar characteristics, namely pristine low concentrations (12 %-14 % occurrence), new particle formation (16 %-32 %), Aitken (21 %-35 %) and accumulation (20 %-50 %). Confined for longer time periods by consolidated pack sea ice regions, the Greenland site VRS shows PSDs with lower ultrafine-mode aerosol concentrations during summer but higher accumulation-mode aerosol concentrations during winter, relative to the Svalbard sites. By association with chemical composition and cloud condensation nuclei properties, further conclusions can be derived. Three distinct types of accumulation-mode aerosol are observed during winter months. These are associated with sea spray (largest detectable sizes, > 400 nm), Arctic haze (main mode at 150 nm) and aged accumulation-mode (main mode at 220 nm) aerosols. In contrast, locally produced particles, most likely of marine biogenic origin, exhibit size distributions dominated by the nucleation and Aitken mode during summer months. The obtained data and analysis point towards future studies, including apportioning the relative contribution of primary and secondary aerosol formation processes and elucidating anthropogenic aerosol dynamics and transport and removal processes across the Greenland Sea. In order to address important research questions in the Arctic on scales beyond a singular station or measurement events, it is imperative to continue strengthening international scientific cooperation.
This research has been supported by the Spanish Ministry of Economy through project BIO-NUC (CGL2013-49020-R), PI-ICE (CTM2017-89117-R) and the Ramon y Cajal fellowship (RYC-2012-11922). The research leading to these results has received funding from the European Union's Horizon 2020 research and innovation programme under grant agreement no. 654109, the Danish Council for Independent Research (project NUMEN, DFF-FTP-4005-00485B) and previously from the European Union Seventh Framework Programme (FP7/2007-2013) under grant agreement no. 262254. The work at Villum Research Station, Station Nord, was financially supported by the Danish Environmental Protection Agency via the MIKA/DANCEA funds for Environmental Support to the Arctic Region. The Villum Foundation funded the construction of Villum Research Station, Station Nord. CCN measurements are supported by a KOPRI program (PN19081), funded by a National Research Foundation of Korea grant (NRF-2016M1A5A1901769). The authors acknowledge financial support (to David C. S. Beddows) from the Natural Environment Research Council's funding of the National Centre for Atmospheric Science (NCAS) (grant number R8/H12/83/011).
Peer Reviewed
On the conservation of the slow conformational dynamics within the amino acid kinase family: NAGK the paradigm
N-Acetyl-L-Glutamate Kinase (NAGK) is the structural paradigm for examining the catalytic mechanisms and dynamics of amino acid kinase family members. Given that the slow conformational dynamics of NAGK (at the microsecond time scale or slower) may be rate-limiting, it is important to assess the mechanisms of the most cooperative modes of motion intrinsically accessible to this enzyme. Here, we present the results from normal mode analysis using an elastic network model representation, which shows that the conformational mechanisms for substrate binding by NAGK strongly correlate with the intrinsic dynamics of the enzyme in the unbound form. We further analyzed the potential mechanisms of allosteric signalling within NAGK using a Markov model for network communication. Comparative analysis of the dynamics of family members strongly suggests that the low-frequency modes of motion and the associated intramolecular couplings that establish signal transduction are highly conserved among family members, in support of the paradigm sequence→structure→dynamics→function. © 2010 Marcos et al.
The Oomph in Economic Philosophy: A Bibliometric Analysis of the Main Trends, from the 1960s to the Present