77 research outputs found
MalStone: Towards A Benchmark for Analytics on Large Data Clouds
Developing data mining algorithms that are suitable for cloud computing
platforms is currently an active area of research, as is developing cloud
computing platforms appropriate for data mining. Currently, the most common
benchmarks for cloud computing are Terasort and its related benchmarks.
Although the Terasort Benchmark is quite useful, it was not designed for data
mining per se. In this paper, we introduce a benchmark called MalStone that is
specifically designed to measure the performance of cloud computing middleware
that supports the type of data intensive computing common when building data
mining models. We also introduce MalGen, which is a utility for generating data
on clouds that can be used with MalStone.
ViroFind: A novel target-enrichment deep-sequencing platform reveals a complex JC virus population in the brain of PML patients
Deep nucleotide sequencing enables the unbiased, broad-spectrum detection of viruses in clinical samples without requiring an a priori hypothesis for the source of infection. However, its use in clinical research applications is limited by low cost-effectiveness, given that most of the sequencing information from clinical samples is related to the human genome, which renders the analysis of viral genomes challenging. To overcome this limitation, we developed ViroFind, an in-solution target-enrichment platform for virus detection and discovery in clinical samples. ViroFind comprises 165,433 viral probes that cover the genomes of 535 selected DNA and RNA viruses that infect humans or could cause zoonosis. The ViroFind probes are used in a hybridization reaction to enrich viral sequences and therefore enhance the detection of viral genomes via deep sequencing. We used ViroFind to detect and analyze all viral populations in the brain of 5 patients with progressive multifocal leukoencephalopathy (PML) and of 18 control subjects with no known neurological disease. Compared to direct deep sequencing, by using ViroFind we enriched viral sequences present in the clinical samples up to 127-fold. We discovered highly complex polyoma virus JC populations in the PML brain samples with a remarkable degree of genetic divergence among the JC virus variants of each PML brain sample. Specifically for the viral capsid protein VP1 gene, we identified 24 single nucleotide substitutions, 12 of which were associated with amino acid changes. The most frequent (4 of 5 samples, 80%) amino acid change was D66H, which is associated with enhanced tissue tropism, and hence likely a viral fitness advantage, compared to other variants. Lastly, we also detected sparse JC virus sequences in 10 of 18 (55.5%) control samples and sparse human herpes virus 6B (HHV6B) sequences in the brain of 11 of 18 (61.1%) control subjects.
In sum, ViroFind enabled the in-depth analysis of all viral genomes in PML and control brain samples and allowed us to demonstrate a high degree of JC virus genetic divergence in vivo that has been previously underappreciated. ViroFind can be used to investigate the structure of the virome with unprecedented depth in health and disease states.
Drum training induces long-term plasticity in the cerebellum and connected cortical thickness
It is unclear to what extent cerebellar networks show long-term plasticity and accompanying changes in cortical structures. Using drumming as a demanding multimodal motor training, we compared cerebellar lobular volume and white matter microstructure, as well as cortical thickness, of 15 healthy non-musicians before and after learning to drum with those of 16 age-matched novice control participants. After 8 weeks of group drumming instruction (3 × 30 minutes per week), we observed significant changes in cerebellar grey matter (volume increase of left VIIIa, relative decrease of VIIIb and vermis Crus I volume) and in white matter microstructure in the inferior cerebellar peduncle. These plastic cerebellar changes were complemented by changes in cortical thickness (increase in left paracentral, right precuneus and right but not left superior frontal thickness), suggesting an interplay of cerebellar learning with cortical structures enabled through cerebellar pathways.
Research priorities in hypertrophic cardiomyopathy: report of a Working Group of the National Heart, Lung, and Blood Institute.
Hypertrophic cardiomyopathy (HCM) is a myocardial disorder characterized by left ventricular (LV) hypertrophy without dilatation and without apparent cause (i.e., it occurs in the absence of severe hypertension, aortic stenosis, or other cardiac or systemic diseases that might cause LV hypertrophy). Numerous excellent reviews and consensus documents provide a wealth of additional background [1–8]. HCM is the leading cause of sudden death in young people and leads to significant disability in survivors. It is caused by mutations in genes that encode components of the sarcomere. Cardiomyocyte and cardiac hypertrophy, myocyte disarray, interstitial and replacement fibrosis, and dysplastic intramyocardial arterioles characterize the pathology of HCM. Clinical manifestations include impaired diastolic function, heart failure, tachyarrhythmia (both atrial and ventricular), and sudden death. At present, there is a lack of understanding of how the mutations in genes encoding sarcomere proteins lead to the phenotypes described above. Current therapeutic approaches have focused on the prevention of sudden death, with implantable cardioverter defibrillator placement in high-risk patients. Medical therapies, however, have largely focused on alleviating symptoms of the disease, not on altering its natural history. The present Working Group of the National Heart, Lung, and Blood Institute brought together clinical, translational, and basic scientists with the overarching goal of identifying novel strategies to prevent the phenotypic expression of disease. Herein, we identify research initiatives that we hope will lead to novel therapeutic approaches for patients with HCM.
De novo mutations in histone modifying genes in congenital heart disease
Congenital heart disease (CHD) is the most frequent birth defect, affecting 0.8% of live births [1]. Many cases occur sporadically and impair reproductive fitness, suggesting a role for de novo mutations. By analysis of exome sequencing of parent-offspring trios, we compared the incidence of de novo mutations in 362 severe CHD cases and 264 controls. CHD cases showed a significant excess of protein-altering de novo mutations in genes expressed in the developing heart, with an odds ratio of 7.5 for damaging mutations. Similar odds ratios were seen across major classes of severe CHD. We found a marked excess of de novo mutations in genes involved in production, removal or reading of H3K4 methylation (H3K4me), or ubiquitination of H2BK120, which is required for H3K4 methylation [2–4]. There were also two de novo mutations in SMAD2; SMAD2 signaling in the embryonic left-right organizer induces demethylation of H3K27me [5]. H3K4me and H3K27me mark 'poised' promoters and enhancers that regulate expression of key developmental genes [6]. These findings implicate de novo point mutations in several hundred genes that collectively contribute to ~10% of severe CHD.
Torture in Counterterrorism: Agency Incentives and Slippery Slopes
We develop a model of counterterrorism to analyze the effects of allowing a government agency to torture terrorist suspects. We find that legalizing torture in high evidence cases has offsetting effects on agency incentives to counter terrorism by means other than torture. It increases these incentives because other efforts may increase the probability of having high enough evidence to warrant the use of torture if other efforts fail. However, it also lowers these incentives because the agency might come to rely on torture to avert attacks. If the latter effect dominates, legalizing torture in high evidence cases can reduce security and increase the probability of terrorist attack. Moreover, it can increase agency incentives to torture even in low evidence cases, leading to a "slippery slope." (JEL K4, D8, H1)
Clinical Sequencing Exploratory Research Consortium: Accelerating Evidence-Based Practice of Genomic Medicine
Despite rapid technical progress and demonstrable effectiveness for some types of diagnosis and therapy, much remains to be learned about clinical genome and exome sequencing (CGES) and its role within the practice of medicine. The Clinical Sequencing Exploratory Research (CSER) consortium includes 18 extramural research projects, one National Human Genome Research Institute (NHGRI) intramural project, and a coordinating center funded by the NHGRI and National Cancer Institute. The consortium is exploring analytic and clinical validity and utility, as well as the ethical, legal, and social implications of sequencing via multidisciplinary approaches; it has thus far recruited 5,577 participants across a spectrum of symptomatic and healthy children and adults by utilizing both germline and cancer sequencing. The CSER consortium is analyzing data and creating publicly available procedures and tools related to participant preferences and consent, variant classification, disclosure and management of primary and secondary findings, health outcomes, and integration with electronic health records. Future research directions will refine measures of clinical utility of CGES in both germline and somatic testing, evaluate the use of CGES for screening in healthy individuals, explore the penetrance of pathogenic variants through extensive phenotyping, reduce discordances in public databases of genes and variants, examine social and ethnic disparities in the provision of genomics services, explore regulatory issues, and estimate the value and downstream costs of sequencing. The CSER consortium has established a shared community of research sites by using diverse approaches to pursue the evidence-based development of best practices in genomic medicine.
Age at first birth in women is genetically associated with increased risk of schizophrenia
Prof. Paunio is a member of the PGC. Previous studies have shown an increased risk for mental health problems in children born to both younger and older parents compared to children of average-aged parents. We previously used a novel design to reveal a latent mechanism of genetic association between schizophrenia and age at first birth in women (AFB). Here, we use independent data from the UK Biobank (N = 38,892) to replicate the finding of an association between predicted genetic risk of schizophrenia and AFB in women, and to estimate the genetic correlation between schizophrenia and AFB in women stratified into younger and older groups. We find evidence for an association between predicted genetic risk of schizophrenia and AFB in women (P-value = 1.12E-05), and we show genetic heterogeneity between younger and older AFB groups (P-value = 3.45E-03). The genetic correlation between schizophrenia and AFB in the younger AFB group is -0.16 (SE = 0.04) while that between schizophrenia and AFB in the older AFB group is 0.14 (SE = 0.08). Our results suggest that early, and perhaps also late, age at first birth in women is associated with increased genetic risk for schizophrenia in the UK Biobank sample. These findings contribute new insights into factors contributing to the complex bio-social risk architecture underpinning the association between parental age and offspring mental health. Peer reviewed.
The Changing Landscape for Stroke Prevention in AF: Findings From the GLORIA-AF Registry Phase 2
Background: GLORIA-AF (Global Registry on Long-Term Oral Antithrombotic Treatment in Patients with Atrial Fibrillation) is a prospective, global registry program describing antithrombotic treatment patterns in patients with newly diagnosed nonvalvular atrial fibrillation at risk of stroke. Phase 2 began when dabigatran, the first non–vitamin K antagonist oral anticoagulant (NOAC), became available. Objectives: This study sought to describe phase 2 baseline data and compare these with the pre-NOAC era collected during phase 1. Methods: During phase 2, 15,641 consenting patients were enrolled (November 2011 to December 2014); 15,092 were eligible. This pre-specified cross-sectional analysis describes eligible patients’ baseline characteristics. Atrial fibrillation disease characteristics, medical outcomes, and concomitant diseases and medications were collected. Data were analyzed using descriptive statistics. Results: Of the total patients, 45.5% were female; median age was 71 (interquartile range: 64, 78) years. Patients were from Europe (47.1%), North America (22.5%), Asia (20.3%), Latin America (6.0%), and the Middle East/Africa (4.0%). Most had high stroke risk (CHA2DS2-VASc [Congestive heart failure, Hypertension, Age ≥75 years, Diabetes mellitus, previous Stroke, Vascular disease, Age 65 to 74 years, Sex category] score ≥2; 86.1%); 13.9% had moderate risk (CHA2DS2-VASc = 1). Overall, 79.9% received oral anticoagulants, of whom 47.6% received NOAC and 32.3% vitamin K antagonists (VKA); 12.1% received antiplatelet agents; 7.8% received no antithrombotic treatment. For comparison, the proportion of phase 1 patients (of N = 1,063 all eligible) prescribed VKA was 32.8%, acetylsalicylic acid 41.7%, and no therapy 20.2%. In Europe in phase 2, treatment with NOAC was more common than VKA (52.3% and 37.8%, respectively); 6.0% of patients received antiplatelet treatment; and 3.8% received no antithrombotic treatment.
In North America, 52.1%, 26.2%, and 14.0% of patients received NOAC, VKA, and antiplatelet drugs, respectively; 7.5% received no antithrombotic treatment. NOAC use was less common in Asia (27.7%), where 27.5% of patients received VKA, 25.0% antiplatelet drugs, and 19.8% no antithrombotic treatment. Conclusions: The baseline data from GLORIA-AF phase 2 demonstrate that in newly diagnosed nonvalvular atrial fibrillation patients, NOAC have been highly adopted into practice, becoming more frequently prescribed than VKA in Europe and North America. Worldwide, however, a large proportion of patients remain undertreated, particularly in Asia and North America. (Global Registry on Long-Term Oral Antithrombotic Treatment in Patients With Atrial Fibrillation [GLORIA-AF]; NCT01468701)
The Somatic Genomic Landscape of Glioblastoma
We describe the landscape of somatic genomic alterations based on multi-dimensional and comprehensive characterization of more than 500 glioblastoma tumors (GBMs). We identify several novel mutated genes as well as complex rearrangements of signature receptors including EGFR and PDGFRA. TERT promoter mutations are shown to correlate with elevated mRNA expression, supporting a role in telomerase reactivation. Correlative analyses confirm that the survival advantage of the proneural subtype is conferred by the G-CIMP phenotype, and MGMT DNA methylation may be a predictive biomarker for treatment response only in classical subtype GBM. Integrative analysis of genomic and proteomic profiles challenges the notion of therapeutic inhibition of a pathway as an alternative to inhibition of the target itself. These data will facilitate the discovery of therapeutic and diagnostic target candidates, the validation of research and clinical observations and the generation of unanticipated hypotheses that can advance our molecular understanding of this lethal cancer.