167 research outputs found

    Distributed Management of Massive Data: an Efficient Fine-Grain Data Access Scheme

    This paper addresses the problem of efficiently storing and accessing massive data blocks in a large-scale distributed environment, while providing efficient fine-grain access to data subsets. This issue is crucial in the context of applications in the field of databases, data mining and multimedia. We propose a data sharing service based on distributed, RAM-based storage of data, while leveraging a DHT-based, natively parallel metadata management scheme. As opposed to the most commonly used grid storage infrastructures that provide mechanisms for explicit data localization and transfer, we provide a transparent access model, where data are accessed through global identifiers. Our proposal has been validated through a prototype implementation whose preliminary evaluation provides promising results

    A type system for statically detecting spreadsheet errors

    Spark versus Flink: Understanding Performance in Big Data Analytics Frameworks

    Big Data analytics has recently gained popularity as a tool to process large amounts of data on demand. Spark and Flink are two Apache-hosted data analytics frameworks that facilitate the development of multi-step data pipelines using directed acyclic graph patterns. Making the most of these frameworks is challenging because efficient execution relies strongly on complex parameter configurations and on an in-depth understanding of the underlying architectural choices. Although extensive research has been devoted to improving and evaluating the performance of such analytics frameworks, most studies benchmark the platforms against Hadoop as a baseline, a rather unfair comparison considering the fundamentally different design principles. This paper aims to bring some justice in this respect by directly evaluating the performance of Spark and Flink. Our goal is to identify and explain the impact of the different architectural choices and parameter configurations on the perceived end-to-end performance. To this end, we develop a methodology for correlating the parameter settings and the operators' execution plan with the resource usage. We use this methodology to dissect the performance of Spark and Flink with several representative batch and iterative workloads on up to 100 nodes. Our key finding is that neither framework outperforms the other for all data types, sizes and job patterns. This paper performs a fine characterization of the cases when each framework is superior, and we highlight how this performance correlates to operators, to resource usage and to the specifics of the internal framework design

    Parallel and Distributed Data Management. Introduction

    The manipulation and handling of an ever increasing volume of data by current data-intensive applications require novel techniques for efficient data management. Despite recent advances in every aspect of data management (storage, access, querying, analysis, mining), future applications are expected to scale to even higher degrees, not only in terms of volumes of data handled but also in terms of users and resources, often making use of multiple, pre-existing autonomous, distributed or heterogeneous resources

    How to bring together fault tolerance and data consistency to enable Grid data sharing

    One of the predominant themes in the criminal justice literature is that prosecutors dominate the justice system. Over seventy-five years ago, Attorney General Robert Jackson famously proclaimed that the “prosecutor has more control over life, liberty, and reputation than any other person in America.” In one of the most cited law review articles of all time, Bill Stuntz added that prosecutors—not legislators, judges, or police—“are the criminal justice system’s real lawmakers.” And an unchallenged modern consensus holds that prosecutors “rule the criminal justice system.” This Article applies a critical lens to longstanding claims of prosecutorial preeminence. It reveals a curious echo chamber enabled by a puzzling lack of dissent. With few voices challenging ever-more-strident prosecutor-dominance rhetoric, academic claims became uncritical, imprecise, and ultimately incorrect. An unchallenged consensus that “prosecutors are the criminal justice system” and that the “institution of the prosecutor has more power than any other in the criminal justice system” has real consequences for criminal justice discourse. Portraying prosecutors as the system’s iron-fisted rulers obscures the complex interplay that actually determines criminal justice outcomes. The overheated rhetoric of prosecutorial preeminence fosters a superficial understanding of the criminal justice system, overlooks the powerful forces that can and do constrain prosecutors, and diverts attention from the most promising sources of reform (legislators, judges, and police) to the least (prosecutors)

    Tyr: Blob Storage Meets Built-In Transactions

    Concurrent Big Data applications often require high-performance storage, as well as ACID (Atomicity, Consistency, Isolation, Durability) transaction support. Although blobs (binary large objects) are an increasingly popular model for addressing the storage needs of such applications, state-of-the-art blob storage systems typically offer no transaction semantics. This requires users to coordinate access to data carefully in order to avoid race conditions, inconsistent writes, overwrites and other problems that cause erratic behavior. We argue there is a gap between existing storage solutions and application requirements, which limits the design of transaction-oriented applications. We introduce Tyr, the first blob storage system to provide built-in, multiblob transactions, while retaining sequential consistency and high throughput under heavy access concurrency. Tyr offers fine-grained random write access to data and in-place atomic operations. Large-scale experiments on Microsoft Azure with a production application from CERN LHC show Tyr throughput outperforming state-of-the-art solutions by more than 75%

    Pharmacologic modulation of experimentally induced allergic asthma

    Allergic asthma is the most frequent disease of the respiratory tract. The aim of the current experimental and clinical studies was to find new sources of drugs able to control asthmatic inflammation and airway hyperresponsiveness. Our experimental studies were focused on efficiency evaluation of substances able to influence activities of ion channels, phosphodiesterase (PDE) isoforms, substances from the group of polyphenols and NO metabolism modulators during experimentally induced allergic asthma

    Serotonin and corticosterone rhythms in mice exposed to cigarette smoke and in patients with COPD: implication for COPD-associated neuropathogenesis

    The circadian timing system controls daily rhythms of physiology and behavior, and disruption of clock function can trigger stressful life events. Daily exposure to cigarette smoke (CS) can lead to alteration in diverse biological and physiological processes. Smoking is associated with mood disorders, including depression and anxiety. Patients with chronic obstructive pulmonary disease (COPD) have abnormal circadian rhythms, reflected by daily changes in respiratory symptoms and lung function. Corticosterone (CORT) is an adrenal steroid that plays a considerable role in stress and anti-inflammatory responses. Serotonin (5-hydroxytryptamine; 5HT) is a neurohormone, which plays a role in sleep/wake regulation and affective disorders. Secretion of stress hormones (CORT and 5HT) is under the control of the circadian clock in the suprachiasmatic nucleus. Since smoking is a contributing factor in the development of COPD, we hypothesize that CS can affect circadian rhythms of CORT and 5HT secretion leading to sleep and mood disorders in smokers and patients with COPD. We measured the daily rhythms of plasma CORT and 5HT in mice following acute (3 d), sub-chronic (10 d) or chronic (6 mo) CS exposure and in plasma from non-smokers, smokers and patients with COPD. Acute and chronic CS exposure affected both the timing (peak phase) and amplitude of the daily rhythm of plasma CORT and 5HT in mice. Acute CS appeared to have subtle time-dependent effects on CORT levels but more pronounced effects on 5HT. As compared with CORT, plasma 5HT was slightly elevated in smokers but was reduced in patients with COPD. Thus, the effects of CS on plasma 5HT were consistent between mice and patients with COPD. Together, these data reveal a significant impact of CS exposure on rhythms of stress hormone secretion and subsequent detrimental effects on cognitive function, depression-like behavior, mood/anxiety and sleep quality in smokers and patients with COPD

    NuSTAR and Chandra Insight into the Nature of the 3-40 keV Nuclear Emission in NGC 253

    We present results from three nearly simultaneous Nuclear Spectroscopic Telescope Array (NuSTAR) and Chandra monitoring observations between 2012 September 2 and 2012 November 16 of the local star-forming galaxy NGC 253. The 3-40 keV intensity of the inner ~20 arcsec (~400 pc) nuclear region, as measured by NuSTAR, varied by a factor of ~2 across the three monitoring observations. The Chandra data reveal that the nuclear region contains three bright X-ray sources, including a luminous (L_{2-10 keV} ~ few × 10^39 erg s^-1) point source located ~1 arcsec from the dynamical center of the galaxy (within the 3σ positional uncertainty of the dynamical center); this source drives the overall variability of the nuclear region at energies ≳ 3 keV. We make use of the variability to measure the spectra of this single hard X-ray source when it was in bright states. The spectra are well described by an absorbed (column density N_H ≈ 1.6 × 10^23 cm^-2) broken power-law model with spectral slopes and break energies that are typical of ultraluminous X-ray sources (ULXs), but not active galactic nuclei (AGNs). A previous Chandra observation in 2003 showed a hard X-ray point source of similar luminosity to the 2012 source that was also near the dynamical center (Phi ≈ 0.4 arcsec); however, this source was offset from the 2012 source position by ~1 arcsec. We show that the probability of the 2003 and 2012 hard X-ray sources being unrelated is much greater than 99.99% based on the Chandra spatial localizations. Interestingly, the Chandra spectrum of the 2003 source (3-8 keV) is shallower in slope than that of the 2012 hard X-ray source. Its proximity to the dynamical center and harder Chandra spectrum indicate that the 2003 source is a better AGN candidate than any of the sources detected in our 2012 campaign; however, we were unable to rule out a ULX nature for this source. Future NuSTAR and Chandra monitoring would be well equipped to break the degeneracy between the AGN and ULX nature of the 2003 source, if again caught in a high state