Efficient analysis and storage of large-scale genomic data
The impending advent of population-scaled sequencing cohorts involving tens of millions of individuals with matched phenotypic measurements will produce unprecedented volumes of genetic data. Storing and analysing such gargantuan datasets places computational performance at a pivotal position in medical genomics. In this thesis, I explore the potential for accelerating and parallelizing standard genetics workflows, file formats, and algorithms using hardware-accelerated vectorization, parallel and distributed algorithms, and heterogeneous computing.
First, I describe a novel bit-counting operation termed the positional population-count, which can be used together with succinct representations and standard efficient operations to accelerate many genetic calculations. To enable the use of this new operator and the canonical population count on any target machine, I developed a unified low-level library that uses CPU dispatching to select the optimal method at run-time, contingent on the available instruction set architecture and the given input size. As a proof-of-principle application, I apply the positional population-count operator to computing quality control-related summary statistics for terabyte-scaled sequencing readsets with >3,800-fold speed improvements. As another application, I describe a framework for efficiently computing the cardinality of set intersection using these operators and apply this framework to compute genome-wide linkage-disequilibrium in datasets with up to 67 million samples, resulting in up to >60-fold improvements in speed for dense genotypic vectors and up to >250,000-fold savings in memory and >100,000-fold improvement in speed for sparse genotypic vectors. I next describe a framework for handling the terabytes of compressed output data and describe graphical routines for visualizing long-range linkage-disequilibrium blocks as seen over many human centromeres. Finally, I describe efficient algorithms for storing and querying very large genetic datasets and specialized algorithms for the genotype component of such datasets with >10,000-fold savings in memory compared to the current interchange format. Wellcome Trus
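The two operators named in this abstract can be illustrated with a minimal scalar sketch. It is not the thesis library, which the abstract describes as a vectorized, CPU-dispatched implementation. The function names, the 16-bit word width, and the sample values below are assumptions chosen for illustration only.

```python
def positional_popcount(words, width=16):
    """For each bit position 0..width-1, count how many words have that bit set.

    Scalar reference version; `width` is an assumed word size.
    """
    counts = [0] * width
    for word in words:
        for pos in range(width):
            counts[pos] += (word >> pos) & 1
    return counts


def intersection_cardinality(bitmap_a, bitmap_b):
    """Size of the intersection of two sets stored as equal-length lists of
    integer bitmap words: AND the words pairwise, then count the set bits."""
    return sum(bin(a & b).count("1") for a, b in zip(bitmap_a, bitmap_b))


# Hypothetical 4-bit flag words: bit 0 is set in all four, bits 1 and 2 once each.
flags = [0b0001, 0b0011, 0b0101, 0b0001]
print(positional_popcount(flags, width=4))  # [4, 1, 1, 0]

# Hypothetical bitmaps: 0b1011 & 0b0011 has 2 bits set, 0b0110 & 0b0100 has 1.
print(intersection_cardinality([0b1011, 0b0110], [0b0011, 0b0100]))  # 3
```

The ordinary population count returns one total per word, whereas the positional variant returns one total per bit position across all words, which is why it suits per-flag summary statistics.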
Scaling and Universality in City Space Syntax: between Zipf and Matthew
We report on the universality of rank-integration distributions of open spaces in city space syntax, similar to the famous rank-size distributions of cities (Zipf's law). We also demonstrate that the degree of choice an open space represents for other spaces directly linked to it in a city follows a power-law statistic. The universal statistical behavior of space syntax measures uncovers the universality of the city creation mechanism. We suggest that the observed universality may help to establish an international definition of a city as a specific land use pattern. Comment: 24 pages, 5 *.eps figure
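A rank-size law of the Zipf type says that a quantity falls off as a power of its rank, so it appears as a straight line on log-log axes. The sketch below estimates that slope by ordinary least squares. It is a generic illustration, not the authors' method; the function name and the synthetic data are assumptions.

```python
import math


def fit_rank_exponent(values):
    """Least-squares slope of log(value) against log(rank), where rank 1 is
    the largest value. Requires at least two strictly positive values."""
    ordered = sorted(values, reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ordered) + 1)]
    ys = [math.log(value) for value in ordered]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


# Synthetic data that follow value = 1000 / rank exactly, so the slope is close to -1.
sizes = [1000.0 / rank for rank in range(1, 51)]
print(fit_rank_exponent(sizes))
```

A slope near -1 corresponds to the classic form of Zipf's law. Real data are noisy, and a simple log-log regression is only a rough first check of power-law behavior.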
ToF-SIMS mediated analysis of human lung tissue reveals increased iron deposition in COPD (GOLD IV) patients
Chronic obstructive pulmonary disease (COPD) is a debilitating lung disease that is currently the third leading cause of death worldwide. Recent reports have indicated that dysfunctional iron handling in the lungs of COPD patients may be one contributing factor. However, a number of these studies have been limited to the qualitative assessment of iron levels through histochemical staining or to the expression levels of iron-carrier proteins in cells or bronchoalveolar lavage fluid. In this study, we have used time-of-flight secondary ion mass spectrometry (ToF-SIMS) to visualize and relatively quantify iron accumulation in lung tissue sections of healthy donors versus severe COPD patients. The analysis was performed on an IONTOF 5 instrument, and the data were then examined by multivariate analysis. An orthogonal partial least squares discriminant analysis (OPLS-DA) score plot revealed good separation between the two groups. This separation was primarily attributed to differences in iron content, as well as differences in other chemical signals possibly associated with lipid species. Further, relative quantitative analysis revealed twelve times higher iron levels in lung tissue sections of COPD patients when compared to healthy donors. In addition, iron accumulation observed within the cells was heterogeneously distributed, indicating cellular compartmentalization.
Correlative High-Resolution Imaging of Iron Uptake in Lung Macrophages
Detection of iron at the subcellular level in order to gain insights into its transport, storage, and therapeutic prospects to prevent cytotoxic effects of excessive iron accumulation is still a challenge. Nanoscale magnetic sector secondary ion mass spectrometry (SIMS) is an excellent candidate for subcellular mapping of elements in cells since it provides high secondary ion collection efficiency and transmission, coupled with high-lateral-resolution capabilities enabled by nanoscale primary ion beams. In this study, we developed correlative methodologies that implement SIMS high-resolution imaging technologies to study accumulation and determine subcellular localization of iron in alveolar macrophages. We employed transmission electron microscopy (TEM) and backscattered electron (BSE) microscopy to obtain structural information, and the high-resolution analytical tools NanoSIMS and helium ion microscopy-SIMS (HIM-SIMS) to trace the chemical signature of iron. Chemical information from NanoSIMS was correlated with TEM data, while high-spatial-resolution ion maps from HIM-SIMS analysis were correlated with BSE structural information of the cell. NanoSIMS revealed that iron accumulates within mitochondria, and both NanoSIMS and HIM-SIMS showed accumulation of iron in electrolucent compartments such as vacuoles, lysosomes, and lipid droplets. This study provides insights into iron metabolism at the subcellular level and has future potential in finding therapeutics to reduce the cytotoxic effects of excessive iron loading.
Subcellular Mass Spectrometry Imaging and Absolute Quantitative Analysis across Organelles
Mass spectrometry imaging is a field that promises to become a mainstream bioanalysis technology by allowing the combination of single-cell imaging and subcellular quantitative analysis. The frontier of single-cell imaging has advanced to the point where it is now possible to compare the chemical contents of individual organelles in terms of raw or normalized ion signal. However, to realize the full potential of this technology, it is necessary to move beyond this concept of relative quantification. Here we present a nanoSIMS imaging method that measures the absolute concentration of an organelle-associated, isotopically labeled pro-drug directly from a mass spectrometry image. This is validated with a recently developed nanoelectrochemistry method for single organelles. We establish a limit of detection based on the number of isotopic labels used and the volume of the organelle of interest, also offering this calculation as a web application. This approach allows subcellular quantification of drugs and metabolites, an overarching and previously unmet goal in cell science and pharmaceutical development.
Epigenetic analysis of regulatory T cells using multiplex bisulfite sequencing.
This work was supported by Wellcome Trust Grant 096388, JDRF Grant 9-2011-253, the National Institute for Health Research Cambridge Biomedical Research Centre (BRC) and Award P01AI039671 (to LSW and JAT) from the National Institute of Allergy and Infectious Diseases (NIAID). CW is supported by the Wellcome Trust (089989). The content of this article is solely the responsibility of the authors and does not necessarily represent the official views of NIAID or the National Institutes of Health. The Cambridge Institute for Medical Research is in receipt of Wellcome Trust Strategic Award 100140. We gratefully acknowledge the participation of all NIHR Cambridge BioResource volunteers. We thank the Cambridge BioResource staff for their help with volunteer recruitment. We thank members of the Cambridge BioResource SAB and Management Committee for their support of our study and the National Institute for Health Research Cambridge Biomedical Research Centre for funding. We thank Fay Rodger and Ruth Littleboy for running the Illumina MiSeq in the Molecular Genetics Laboratories, Addenbrooke's Hospital, Cambridge. This research was supported by the Cambridge NIHR BRC Cell Phenotyping Hub. In particular, we wish to thank Anna Petrunkina Harrison, Simon McCallum, Christopher Bowman, Natalia Savinykh, Esther Perez and Jelena Markovic Djuric for their advice and support in cell sorting. We also thank Helen Stevens, Pamela Clarke, Gillian Coleman, Sarah Dawson, Jennifer Denesha, Simon Duley, Meeta Maisuria-Armer and Trupti Mistry for acquisition and preparation of samples. This is the final version of the article. It first appeared from Wiley via http://dx.doi.org/10.1002/eji.20154564
Genetics of myocardial interstitial fibrosis in the human heart and association with disease
Myocardial interstitial fibrosis is associated with cardiovascular disease and adverse prognosis. Here, to investigate the biological pathways that underlie fibrosis in the human heart, we developed a machine learning model to measure native myocardial T1 time, a marker of myocardial fibrosis, in 41,505 UK Biobank participants who underwent cardiac magnetic resonance imaging. Greater T1 time was associated with diabetes mellitus, renal disease, aortic stenosis, cardiomyopathy, heart failure, atrial fibrillation, conduction disease and rheumatoid arthritis. Genome-wide association analysis identified 11 independent loci associated with T1 time. The identified loci implicated genes involved in glucose transport (SLC2A12), iron homeostasis (HFE, TMPRSS6), tissue repair (ADAMTSL1, VEGFC), oxidative stress (SOD2), cardiac hypertrophy (MYH7B) and calcium signaling (CAMK2D). Using a transforming growth factor β1-mediated cardiac fibroblast activation assay, we found that 9 of the 11 loci consisted of genes that exhibited temporal changes in expression or open chromatin conformation supporting their biological relevance to myofibroblast cell state acquisition. By harnessing machine learning to perform large-scale quantification of myocardial interstitial fibrosis using cardiac imaging, we validate associations between cardiac fibrosis and disease, and identify new biologically relevant pathways underlying fibrosis.
Peat growth and carbon accumulation rates during the Holocene in boreal mires
This thesis is based on the analysis of peat stratigraphies to study peat growth and carbon accumulation processes in northern mires. In the first study, problems concerning 14C dating of peat were examined by fractionation of bulk peat samples and 14C AMS dating of the separate fractions. In the following studies, peat cores from twelve Swedish mire sites were investigated. Macrofossil analysis was performed on the sampled cores to describe and classify the plant communities during mire development. Between 6 and 18 14C AMS datings were performed on one core from each mire in order to estimate the peat growth and carbon accumulation rates for the identified plant communities.
Different fractions within single peat bulk samples gave considerably differing 14C ages. The range in age differed between mire types and depth. For accurate 14C dating, moss stems, preferably of Sphagnum spp., are recommended (Paper I). Both autogenic and allogenic factors, e.g. developmental stage and climate, respectively, were identified as important influences on carbon accumulation (Paper II). Both peat growth and carbon accumulation rates differed between plant communities. The major factors explaining the variations in accumulation rates of the different plant communities were the amount of Carex and Sphagnum remains and the geographical position of the mire (Paper IV). Carbon accumulation rates decrease along with development in most mires. The results indicate that some mires may have alternated between being carbon sinks and sources, at least over the last several hundred years. The inter-annual variation in carbon accumulation is probably explained by climatic variations (Paper III).
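Dated levels in a core translate into growth and carbon accumulation rates by dividing the depth increment by the age increment for each interval. The sketch below shows that arithmetic under stated assumptions: a single bulk density and carbon fraction for the whole core, ages already calibrated, and hypothetical sample values. It is a generic illustration, not the procedure used in the thesis.

```python
def accumulation_rates(dated_levels, bulk_density_g_cm3, carbon_fraction):
    """Per-interval peat growth and carbon accumulation rates.

    dated_levels: (depth_cm, age_years) pairs ordered from top to bottom,
    with strictly increasing ages.
    Returns a list of (growth_mm_per_year, carbon_g_per_m2_per_year).
    """
    rates = []
    for (d0, a0), (d1, a1) in zip(dated_levels, dated_levels[1:]):
        growth_cm_per_year = (d1 - d0) / (a1 - a0)
        # g C per cm^2 per year, scaled by 10,000 cm^2 per m^2.
        carbon = growth_cm_per_year * bulk_density_g_cm3 * carbon_fraction * 10_000
        rates.append((growth_cm_per_year * 10, carbon))
    return rates


# Hypothetical core: surface, 50 cm dated to 1000 years, 100 cm dated to 3000 years.
core = [(0, 0), (50, 1000), (100, 3000)]
for growth, carbon in accumulation_rates(core, bulk_density_g_cm3=0.1, carbon_fraction=0.5):
    print(f"{growth:.2f} mm/yr, {carbon:.1f} g C/m2/yr")
# 0.50 mm/yr, 25.0 g C/m2/yr
# 0.25 mm/yr, 12.5 g C/m2/yr
```

In practice bulk density and carbon content vary with depth and plant community, so each interval would use its own measured values.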
"It is not cool to write long and good texts" : How teachers work with the writing skill and differences between boys and girls
The study aims to investigate how teachers work with writing skills in grades 4–6, whether they see any differences in writing ability between boys and girls, and, if so, how they counteract these differences. Nejman (2020) argues that the differences in grades between boys and girls are one of the school system's biggest equity problems. The results of the study are based on eight qualitative, semi-structured interviews with practising teachers who either teach or have taught the subject of Swedish. The results are discussed from the sociocultural perspective and in light of previous research on the topic. The findings show that most teachers in this study work with genre pedagogy in the classroom. They also show that teacher certification matters for which working methods and approaches are used to develop writing skills. An important factor mentioned by several of the teachers in the study is that reading strengthens writing ability. Both research and the interview results show that there are differences in writing ability between boys and girls; however, few teachers say that they use any working methods in the classroom to reduce these differences. What matters, according to the teachers, is classroom seating.