10,994 research outputs found

    A statistical method for estimating activity uncertainty parameters to improve project forecasting

    Get PDF
    Just like any physical system, projects have entropy that must be managed by spending energy. The entropy is the project’s tendency to move to a state of disorder (schedule delays, cost overruns), and the energy process is an inherent part of any project management methodology. In order to manage the inherent uncertainty of these projects, accurate estimates (for durations, costs, resources, …) are crucial to make informed decisions. Without these estimates, managers have to fall back on their own intuition and experience, which are undoubtedly crucial for making decisions but are often subject to biases and hard to quantify. This paper builds further on two published calibration methods that aim to extract data from real projects and calibrate them to better estimate the parameters for the probability distributions of activity durations. Both methods rely on the lognormal distribution model to estimate uncertainty in activity durations and perform a sequence of statistical hypothesis tests that take the possible presence of two human biases into account. Based on these two existing methods, a new so-called statistical partitioning heuristic is presented that integrates the best elements of the two methods to further improve the accuracy of estimating the distribution of activity duration uncertainty. A computational experiment has been carried out on a database of 83 empirical projects. The experiment shows that the new statistical partitioning method performs at least as well as, and often better than, the two existing calibration methods. The improvement will allow a better quantification of the activity duration uncertainty, which will eventually lead to a better prediction of the project schedule and more realistic expectations about the project outcomes. Consequently, the project manager will be able to better cope with the inherent uncertainty (entropy) of projects with minimal managerial effort (energy).
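
    As a minimal, illustrative sketch of the lognormal calibration idea (not the authors' exact procedure): assume each activity's actual-to-planned duration ratio is lognormally distributed, estimate its parameters from the logged ratios, and run a normality test on the log-ratios as a goodness-of-fit check. The data, the single Shapiro-Wilk test, and the absence of any bias correction here are all simplifying assumptions.

```python
import numpy as np
from scipy import stats

# Illustrative data: planned vs. actual durations (days) for one project's activities.
planned = np.array([10, 5, 8, 12, 20, 6, 15], dtype=float)
actual = np.array([12, 5, 9, 15, 26, 6, 14], dtype=float)

# Lognormal assumption: log(actual / planned) should be approximately normal.
log_ratios = np.log(actual / planned)

# Parameter estimates for the activity-duration uncertainty distribution.
mu_hat = log_ratios.mean()
sigma_hat = log_ratios.std(ddof=1)

# Simple goodness-of-fit check; the calibration methods in the paper use a
# sequence of hypothesis tests that also account for human biases, which is
# beyond this sketch.
w_stat, p_value = stats.shapiro(log_ratios)

print(f"mu = {mu_hat:.3f}, sigma = {sigma_hat:.3f}, Shapiro-Wilk p = {p_value:.3f}")
```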

    Characterizing the Quantum Confined Stark Effect in Semiconductor Quantum Dots and Nanorods for Single-Molecule Electrophysiology

    Get PDF
    We optimized the performance of quantum confined Stark effect (QCSE) based voltage nanosensors. A high-throughput approach for single-particle QCSE characterization was developed and utilized to screen a library of such nanosensors. Type-II ZnSe/CdS seeded nanorods were found to have the best performance among the different nanosensors evaluated in this work. The degree of correlation between intensity changes and spectral changes of the excitons' emission under an applied field was characterized. An upper limit for the temporal response of individual ZnSe/CdS nanorods to voltage modulation was characterized by high-throughput, high-temporal-resolution intensity measurements using a novel photon counting camera. The measured 3.5 µs response time is limited by the voltage modulation electronics and represents about 30 times higher bandwidth than needed for recording an action potential in a neuron.
    Comment: 36 pages, 6 figures

    The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning

    Full text link
    The nascent field of fair machine learning aims to ensure that decisions guided by algorithms are equitable. Over the last several years, three formal definitions of fairness have gained prominence: (1) anti-classification, meaning that protected attributes (like race, gender, and their proxies) are not explicitly used to make decisions; (2) classification parity, meaning that common measures of predictive performance (e.g., false positive and false negative rates) are equal across groups defined by the protected attributes; and (3) calibration, meaning that conditional on risk estimates, outcomes are independent of protected attributes. Here we show that all three of these fairness definitions suffer from significant statistical limitations. Requiring anti-classification or classification parity can, perversely, harm the very groups they were designed to protect; and calibration, though generally desirable, provides little guarantee that decisions are equitable. In contrast to these formal fairness criteria, we argue that it is often preferable to treat similarly risky people similarly, based on the most statistically accurate estimates of risk that one can produce. Such a strategy, while not universally applicable, often aligns well with policy objectives; notably, this strategy will typically violate both anti-classification and classification parity. In practice, it requires significant effort to construct suitable risk estimates. One must carefully define and measure the targets of prediction to avoid retrenching biases in the data. But, importantly, one cannot generally address these difficulties by requiring that algorithms satisfy popular mathematical formalizations of fairness. By highlighting these challenges in the foundation of fair machine learning, we hope to help researchers and practitioners productively advance the area.
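
    To make the definitions concrete, here is a toy sketch (not the paper's analysis): given risk scores, a threshold decision rule, outcomes, and a protected group label, classification parity compares error rates (e.g., false positive rates) across groups, while calibration compares outcome rates across groups conditional on the risk score. The simulated data, threshold, and score bin below are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: risk scores, a binary protected attribute, and outcomes drawn from the scores.
n = 10_000
group = rng.integers(0, 2, size=n)              # protected attribute (two groups)
risk = rng.beta(2 + group, 5 - group)           # group-dependent risk distribution
outcome = rng.binomial(1, risk)                 # outcomes consistent with the risk scores
decision = (risk >= 0.5).astype(int)            # threshold rule, blind to group membership

# Classification parity: compare false positive rates across groups.
for g in (0, 1):
    negatives = (group == g) & (outcome == 0)
    print(f"group {g}: false positive rate = {decision[negatives].mean():.3f}")

# Calibration: within a risk-score bin, outcome rates should not depend on group.
in_bin = (risk >= 0.4) & (risk < 0.5)
for g in (0, 1):
    sel = in_bin & (group == g)
    print(f"group {g}: outcome rate in [0.4, 0.5) bin = {outcome[sel].mean():.3f}")
```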

    Proceedings of the 2011 New York Workshop on Computer, Earth and Space Science

    Full text link
    The purpose of the New York Workshop on Computer, Earth and Space Sciences is to bring together the New York area's finest astronomers, statisticians, computer scientists, and space and Earth scientists to explore potential synergies between their respective fields. The 2011 edition (CESS2011) was a great success, and we would like to thank all of the presenters and participants for attending. This year was also special as it included authors from the upcoming book titled "Advances in Machine Learning and Data Mining for Astronomy". Over two days, the latest advanced techniques used to analyze the vast amounts of information now available for the understanding of our universe and our planet were presented. These proceedings attempt to provide a small window into the current state of research in this vast interdisciplinary field, and we'd like to thank the speakers who spent the time to contribute to this volume.
    Comment: Author lists modified. 82 pages. Workshop proceedings from CESS 2011 in New York City, Goddard Institute for Space Studies

    Exclusion Limits on the WIMP-Nucleon Cross-Section from the First Run of the Cryogenic Dark Matter Search in the Soudan Underground Lab

    Full text link
    The Cryogenic Dark Matter Search (CDMS-II) employs low-temperature Ge and Si detectors to seek Weakly Interacting Massive Particles (WIMPs) via their elastic scattering interactions with nuclei. Simultaneous measurements of both ionization and phonon energy provide discrimination against interactions of background particles. For recoil energies above 10 keV, events due to background photons are rejected with >99.99% efficiency. Electromagnetic events very near the detector surface can mimic nuclear recoils because of reduced charge collection, but these surface events are rejected with >96% efficiency by using additional information from the phonon pulse shape. Efficient use of active and passive shielding, combined with the 2090 m.w.e. overburden at the experimental site in the Soudan mine, makes the background from neutrons negligible for this first exposure. All cuts are determined in a blind manner from in situ calibrations with external radioactive sources, without any prior knowledge of the event distribution in the signal region. Resulting efficiencies are known to ~10%. A single event with a recoil energy of 64 keV passes all of the cuts and is consistent with the expected misidentification rate of surface-electron recoils. Under the assumptions of a standard dark matter halo, these data exclude previously unexplored parameter space for both spin-independent and spin-dependent WIMP-nucleon elastic scattering. The resulting limit on the spin-independent WIMP-nucleon elastic-scattering cross-section has a minimum of 4x10^-43 cm^2 at a WIMP mass of 60 GeV/c^2. The minimum of the limit for the spin-dependent WIMP-neutron elastic-scattering cross-section is 2x10^-37 cm^2 at a WIMP mass of 50 GeV/c^2.
    Comment: 37 pages, 42 figures

    Data mining based cyber-attack detection

    Get PDF

    Single Cell Proteomics in Biomedicine: High-dimensional Data Acquisition, Visualization and Analysis

    Get PDF
    New insights into cellular heterogeneity over the last decade have spurred the development of a variety of single cell omics tools at a lightning pace. The resultant high-dimensional single cell data generated by these tools require new theoretical approaches and analytical algorithms for effective visualization and interpretation. In this review, we briefly survey the state-of-the-art single cell proteomic tools with a particular focus on data acquisition and quantification, followed by an elaboration of a number of statistical and computational approaches developed to date for dissecting the high-dimensional single cell data. The underlying assumptions, unique features, and limitations of the analytical methods, together with the designated biological questions they seek to answer, will be discussed. Particular attention will be given to those information theoretical approaches that are anchored in a set of first principles of physics and can yield detailed (and often surprising) predictions.

    Reliable ABC model choice via random forests

    Full text link
    Approximate Bayesian computation (ABC) methods provide an elaborate approach to Bayesian inference on complex models, including model choice. Both theoretical arguments and simulation experiments indicate, however, that model posterior probabilities may be poorly evaluated by standard ABC techniques. We propose a novel approach based on a machine learning tool named random forests to conduct selection among the highly complex models covered by ABC algorithms. We thus modify the way Bayesian model selection is both understood and operated, in that we rephrase the inferential goal as a classification problem, first predicting the model that best fits the data with random forests and postponing the approximation of the posterior probability of the predicted MAP to a second stage, also relying on random forests. Compared with earlier implementations of ABC model choice, the ABC random forest approach offers several potential improvements: (i) it often has a larger discriminative power among the competing models, (ii) it is more robust against the number and choice of statistics summarizing the data, (iii) the computing effort is drastically reduced (with a gain in computational efficiency of a factor of at least fifty), and (iv) it includes an approximation of the posterior probability of the selected model. The call to random forests will undoubtedly extend the range of dataset sizes and model complexities that ABC can handle. We illustrate the power of this novel methodology by analyzing controlled experiments as well as genuine population genetics datasets. The proposed methodologies are implemented in the R package abcrf, available on CRAN.
    Comment: 39 pages, 15 figures, 6 tables
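
    For intuition, the classification stage of this approach can be sketched as follows: simulate a reference table of summary statistics under each candidate model, train a random forest to predict the model index from the statistics, and apply it to the observed summaries. This Python/scikit-learn sketch stands in for the first stage of the abcrf R package; the toy models, summary statistics, and forest settings are illustrative assumptions, and the second regression-forest stage that abcrf uses to approximate the posterior probability of the selected model is omitted.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier

rng = np.random.default_rng(1)

def iqr(x):
    q75, q25 = np.percentile(x, [75, 25])
    return q75 - q25

def summaries(x):
    # Summary statistics standing in for those of a real ABC analysis.
    return np.array([x.mean(), x.std(), np.median(x), iqr(x)])

def simulate(model, n_obs=100):
    # Draw a parameter from its prior, then simulate data under the chosen model.
    theta = rng.uniform(0, 5)
    if model == 0:
        x = rng.normal(theta, 1.0, size=n_obs)    # model 0: Gaussian noise
    else:
        x = rng.laplace(theta, 1.0, size=n_obs)   # model 1: Laplace noise
    return summaries(x)

# ABC reference table: model indices and the corresponding simulated summaries.
n_sim = 5_000
models = rng.integers(0, 2, size=n_sim)
table = np.array([simulate(m) for m in models])

# Stage 1: a random forest classifier predicts the model index from the summaries.
forest = RandomForestClassifier(n_estimators=500, random_state=0)
forest.fit(table, models)

observed = summaries(rng.laplace(2.5, 1.0, size=100))   # pseudo-observed dataset
print("predicted model:", forest.predict(observed.reshape(1, -1))[0])
print("vote proportions:", forest.predict_proba(observed.reshape(1, -1))[0])
```

    Note that the forest's vote proportions are not the posterior probability of the selected model; the paper approximates that probability in a separate second stage, also based on random forests.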