154 research outputs found

    Using edit distance to analyse errors in a natural language to logic translation corpus

    We have assembled a large corpus of student submissions to an automatic grading system, where the subject matter involves the translation of natural language sentences into propositional logic. Of the 2.3 million translation instances in the corpus, 286,000 (approximately 12%) are categorized as being in error. We want to understand the nature of the errors that students make, so that we can develop tools and supporting infrastructure that help students with the problems that these errors represent. With this aim in mind, this paper describes an analysis of a significant proportion of the data, using edit distance between incorrect answers and their corresponding correct solutions, and the associated edit sequences, as a means of organising the data and detecting categories of errors. We demonstrate that a large proportion of errors can be accounted for by means of a small number of relatively simple error types, and that the method draws attention to interesting phenomena in the data set.
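
    The edit distance and edit sequence mentioned in this abstract can be illustrated with a minimal sketch. It assumes character-level Levenshtein edits (insert, delete, substitute) with unit costs; the paper's actual tokenisation and cost model are not given here, and the function name `edit_script` and the example formulas are hypothetical.

```python
def edit_script(source: str, target: str):
    """Return (distance, ops) for turning `source` into `target`.

    Assumption: unit-cost Levenshtein edits over single characters.
    Each op is a tuple (kind, source_char, target_char).
    """
    m, n = len(source), len(target)
    # dist[i][j] = edit distance between source[:i] and target[:j]
    dist = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dist[i][0] = i
    for j in range(n + 1):
        dist[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if source[i - 1] == target[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,         # delete from source
                dist[i][j - 1] + 1,         # insert into source
                dist[i - 1][j - 1] + cost,  # match or substitute
            )

    # Walk back through the table to recover one optimal edit sequence.
    ops = []
    i, j = m, n
    while i > 0 or j > 0:
        if (i > 0 and j > 0 and source[i - 1] == target[j - 1]
                and dist[i][j] == dist[i - 1][j - 1]):
            i, j = i - 1, j - 1  # characters match, no edit recorded
        elif i > 0 and j > 0 and dist[i][j] == dist[i - 1][j - 1] + 1:
            ops.append(("substitute", source[i - 1], target[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and dist[i][j] == dist[i - 1][j] + 1:
            ops.append(("delete", source[i - 1], None))
            i -= 1
        else:
            ops.append(("insert", None, target[j - 1]))
            j -= 1
    ops.reverse()
    return dist[m][n], ops


# Hypothetical student answer vs. reference solution.
distance, ops = edit_script("A&B", "A|B")
print(distance, ops)
```

    Grouping incorrect answers by their edit sequence (for example, all answers that differ from the solution by one connective substitution) is one plausible way such sequences could be used to organise errors into categories.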

    Density Distribution Sunflower Plots

    Density distribution sunflower plots are used to display high-density bivariate data. They are useful for data where a conventional scatter plot is difficult to read due to overstriking of the plot symbol. The x-y plane is subdivided into a lattice of regular hexagonal bins of width w specified by the user. The user also specifies the values of l, d, and k that affect the plot as follows. Individual observations are plotted when there are fewer than l observations per bin, as in a conventional scatter plot. Each bin with from l to d observations contains a light sunflower. Other bins contain a dark sunflower. In a light sunflower each petal represents one observation. In a dark sunflower, each petal represents k observations. (A dark sunflower with p petals represents between pk - k/2 and pk + k/2 observations.) The user can control the sizes and colors of the sunflowers. By selecting appropriate colors and sizes for the light and dark sunflowers, plots can be obtained that give both the overall sense of the data density distribution as well as the number of data points in any given region. The use of this graphic is illustrated with data from the Framingham Heart Study. A documented Stata program, called sunflower, is available to draw these graphs. It can be downloaded from the Statistical Software Components archive at http://ideas.repec.org/c/boc/bocode/s430201.html . (Journal of Statistical Software 2003; 8 (3): 1-5. Posted at http://www.jstatsoft.org/index.php?vol=8 .)
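
    The per-bin rule described in this abstract can be sketched as follows. This is an illustration only, not the Stata program: it assumes the hexagonal binning has already produced a count per bin, that l, d, and k are positive integers, and that the petal count of a dark sunflower is the bin count divided by k, rounded to the nearest integer. The function name `sunflower_glyph` is hypothetical.

```python
def sunflower_glyph(count: int, l: int, d: int, k: int):
    """Classify one bin given its observation count.

    Returns (kind, n), where n is the number of plotted points for
    "points" and the number of petals for "light" or "dark".
    """
    if count < l:
        # Sparse bin: plot each observation individually.
        return ("points", count)
    if count <= d:
        # Light sunflower: one petal per observation.
        return ("light", count)
    # Dark sunflower: each petal stands for k observations.
    # Integer form of round-half-up(count / k), so that p petals
    # cover roughly p*k - k/2 to p*k + k/2 observations.
    petals = (2 * count + k) // (2 * k)
    return ("dark", petals)


# Illustrative thresholds (assumed values, not taken from the paper).
for n in (2, 10, 52):
    print(n, sunflower_glyph(n, l=3, d=13, k=5))
```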

    Appraising HEI-community Partnerships: Assessing Performance, Monitoring Progress, and Evaluating Impacts

    Momentum for creating partnerships between higher education institutions (HEIs) and communities is strong. As their significance intensifies, the question of how to judge their value is garnering increasing attention. In this perspective article, we develop a framework for comprehensively appraising HEI-community partnerships. Constituent parts of the framework are unpacked, and application of the framework is then discussed. The appraisal framework provides a mechanism to document evidence of worth and, most importantly, contributes to the continuous improvement and learning imperative of HEI-community partnerships.

    Assessing the Performance of Higher Education Institution (HEI)- Community Partnerships

    As partnerships between Higher Education Institutions (HEIs) and communities have taken on increased importance, greater attention has been paid to how these partnerships are formed, the manner in which they operate, and what they can accomplish. Assessing the performance of these partnerships is critical for accountability, transparency, and understanding their value. However, no performance assessment framework exists for HEI-community partnerships. In this paper we summarize scholarship on HEI-community partnerships and present a conceptual framework to assess their performance. The assessment framework provides a mechanism for continuous improvement. Practical considerations and future research directions conclude the paper.

    Effect of CPOE User Interface Design on User-Initiated Access to Educational and Patient Information during Clinical Care

    Objective: Authors evaluated whether displaying context-sensitive links to infrequently accessed educational materials and patient information via the user interface of an inpatient computerized care provider order entry (CPOE) system would affect access rates to the materials. Design: The CPOE of Vanderbilt University Hospital (VUH) included "baseline" clinical decision support advice for safety and quality. Authors augmented this with seven new primarily educational decision support features. A prospective, randomized, controlled trial compared clinicians' utilization rates for the new materials via two interfaces. Control subjects could access study-related decision support from a menu in the standard CPOE interface. Intervention subjects received active notification when study-related decision support was available through context-sensitive, visibly highlighted, selectable hyperlinks. Measurements: Rates of opportunities to access and utilization of study-related decision support materials from April 1999 through March 2000 on seven VUH Internal Medicine wards. Results: During 4,466 intervention subject-days, there were 240,504 (53.9/subject-day) opportunities for study-related decision support, while during 3,397 control subject-days, there were 178,235 (52.5/subject-day) opportunities for such decision support (p = 0.11). Individual intervention subjects accessed the decision support features at least once on 3.8% of subject-days logged on (278 responses); controls accessed it at least once on 0.6% of subject-days (18 responses), with a response rate ratio adjusted for decision support frequency of 9.17 (95% confidence interval 4.6-18, p < 0.0005). On average, intervention subjects accessed study-related decision support materials once every 16 days individually and once every 1.26 days in aggregate. Conclusion: Highlighting availability of context-sensitive educational materials and patient information through visible hyperlinks significantly increased utilization rates for study-related decision support when compared to "standard" VUH CPOE methods, although absolute response rates were low.

    The dynamical state of stellar structure in star-forming regions

    The fraction of star formation that results in bound star clusters is influenced by the density spectrum in which stars are formed and by the response of the stellar structure to gas expulsion. We analyse hydrodynamical simulations of turbulent fragmentation in star-forming regions to assess the dynamical properties of the resulting population of stars and (sub)clusters. Stellar subclusters are identified using a minimum spanning tree algorithm. When considering only the gravitational potential of the stars and ignoring the gas, we find that the identified subclusters are close to virial equilibrium (the typical virial ratio Q_vir~0.59, where virial equilibrium would be Q_vir~0.5). This virial state is a consequence of the low gas fractions within the subclusters, caused by the accretion of gas onto the stars and the accretion-induced shrinkage of the subclusters. Because the subclusters are gas-poor, up to a length scale of 0.1-0.2 pc at the end of the simulation, they are only weakly affected by gas expulsion. The fraction of subclusters that reaches the high density required to evolve to a gas-poor state increases with the density of the star-forming region. We extend this argument to star cluster scales, and suggest that the absence of gas indicates that the early disruption of star clusters due to gas expulsion (infant mortality) plays a smaller role than anticipated, and is potentially restricted to star-forming regions with low ambient gas densities. We propose that in dense star-forming regions, the tidal shocking of young star clusters by the surrounding gas clouds could be responsible for the early disruption. This 'cruel cradle effect' would work in addition to disruption by gas expulsion. We suggest possible methods to quantify the relative contributions of both mechanisms. Comment: 13 pages, 10 figures; Accepted for publication in MNRAS
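
    The virial ratio quoted in this abstract can be made concrete with a short sketch. It assumes the common definition Q_vir = T/|U|, with T the kinetic energy measured in the centre-of-mass frame and U the pairwise gravitational potential energy of the stars alone (gas ignored, as the abstract states). The unit system (Msun, pc, km/s), the function name `virial_ratio`, and the two-star test case are illustrative assumptions, not details from the paper.

```python
import math

# Gravitational constant in pc Msun^-1 (km/s)^2.
G = 4.30091e-3


def virial_ratio(masses, positions, velocities):
    """Return Q = T / |U| for point masses (stars only, no gas).

    masses: list of floats [Msun]
    positions: list of (x, y, z) tuples [pc]
    velocities: list of (vx, vy, vz) tuples [km/s]
    """
    total_mass = sum(masses)
    # Centre-of-mass velocity, removed so bulk motion does not count as T.
    v_com = [
        sum(m * v[a] for m, v in zip(masses, velocities)) / total_mass
        for a in range(3)
    ]
    kinetic = sum(
        0.5 * m * sum((v[a] - v_com[a]) ** 2 for a in range(3))
        for m, v in zip(masses, velocities)
    )
    # Pairwise potential energy, each pair counted once.
    potential = 0.0
    for i in range(len(masses)):
        for j in range(i + 1, len(masses)):
            r = math.dist(positions[i], positions[j])
            potential -= G * masses[i] * masses[j] / r
    return kinetic / abs(potential)


# Two equal-mass stars on a circular orbit, an illustrative bound system.
sep = 0.1                          # separation in pc
v = math.sqrt(G * 1.0 / (2 * sep))  # orbital speed of each star in km/s
q = virial_ratio(
    [1.0, 1.0],
    [(-sep / 2, 0.0, 0.0), (sep / 2, 0.0, 0.0)],
    [(0.0, v, 0.0), (0.0, -v, 0.0)],
)
print(q)
```

    For this two-body case the ratio comes out at the equilibrium value of 0.5 up to floating-point rounding, which is the reference point against which the abstract's Q_vir~0.59 is compared.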

    Trend differences in lower stratospheric water vapour between Boulder and the zonal mean and their role in understanding fundamental observational discrepancies

    Trend estimates with different signs are reported in the literature for lower stratospheric water vapour considering the time period between the late 1980s and 2010. The NOAA (National Oceanic and Atmospheric Administration) frost point hygrometer (FPH) observations at Boulder (Colorado, 40.0° N, 105.2° W) indicate positive trends (about 0.1 to 0.45 ppmv decade^-1). On the contrary, negative trends (approximately −0.2 to −0.1 ppmv decade^-1) are derived from a merged zonal mean satellite data set for a latitude band around the Boulder latitude. Overall, the trend differences between the two data sets range from about 0.3 to 0.5 ppmv decade^-1, depending on altitude. It has been proposed that a possible explanation for these discrepancies is a different temporal behaviour at Boulder and the zonal mean. In this work we investigate trend differences between Boulder and the zonal mean using primarily simulations from ECHAM/MESSy (European Centre for Medium-Range Weather Forecasts Hamburg/Modular Earth Submodel System) Atmospheric Chemistry (EMAC), WACCM (Whole Atmosphere Community Climate Model), CMAM (Canadian Middle Atmosphere Model) and CLaMS (Chemical Lagrangian Model of the Stratosphere). On shorter timescales we address this aspect also based on satellite observations from UARS/HALOE (Upper Atmosphere Research Satellite/Halogen Occultation Experiment), Envisat/MIPAS (Environmental Satellite/Michelson Interferometer for Passive Atmospheric Sounding) and Aura/MLS (Microwave Limb Sounder). Overall, both the simulations and observations exhibit trend differences between Boulder and the zonal mean. The differences are dependent on altitude and the time period considered. The model simulations indicate only small trend differences between Boulder and the zonal mean for the time period between the late 1980s and 2010. These are clearly not sufficient to explain the discrepancies between the trend estimates derived from the FPH observations and the merged zonal mean satellite data set. Unless the simulations underrepresent variability or the trend differences originate from smaller spatial and temporal scales than resolved by the model simulations, trends at Boulder for this time period should also be quite representative for the zonal mean and even other latitude bands. Trend differences for a decade of data are larger and need to be kept in mind when comparing results for Boulder and the zonal mean on this timescale. Beyond that, we find that the trend estimates for the time period between the late 1980s and 2010 also significantly differ among the simulations. They are larger than those derived from the merged satellite data set and smaller than the trend estimates derived from the FPH observations.

    The Gaia-ESO Survey: Dynamical analysis of the L1688 region in Ophiuchus

    The Gaia ESO Public Spectroscopic Survey (GES) is providing the astronomical community with high-precision measurements of many stellar parameters including radial velocities (RVs) of stars belonging to several young clusters and star-forming regions. One of the main goals of the young cluster observations is to study their dynamical evolution and provide insight into their future, revealing whether they will eventually disperse to populate the field rather than evolve into bound open clusters. In this paper we report the analysis of the dynamical state of L1688 in the ρ Ophiuchi molecular cloud using the dataset provided by the GES consortium. We performed the membership selection of the more than 300 objects observed. Using the presence of the lithium absorption and the location in the Hertzsprung-Russell diagram, we identify 45 already known members and two new association members. We provide accurate RVs for all 47 confirmed members. A dynamical analysis, after accounting for unresolved binaries and errors, shows that the stellar surface population of L1688 has a velocity dispersion σ ~ 1.14 ± 0.35 km s^-1 that is consistent with being in virial equilibrium and is bound with a ~80% probability. We also find a velocity gradient in the stellar surface population of ~1.0 km s^-1 pc^-1 in the northwest-southeast direction, which is consistent with that found for the pre-stellar dense cores, and we discuss the possibility of sequential and triggered star formation in L1688.

    On the fraction of star formation occurring in bound stellar clusters

    We present a theoretical framework in which bound stellar clusters arise naturally at the high-density end of the hierarchy of the interstellar medium (ISM). Due to short free-fall times, these high-density regions achieve high local star formation efficiencies, enabling them to form bound clusters. Star-forming regions of lower density remain substructured and gas-rich, ending up unbound when the residual gas is expelled. Additionally, the tidal perturbation of star-forming regions by nearby, dense giant molecular clouds imposes a minimum density contrast required for the collapse to a bound cluster. The fraction of all star formation that occurs in bound stellar clusters (the cluster formation efficiency or CFE) follows by integration of these local clustering and survival properties over the full density spectrum of the ISM, and hence is set by galaxy-scale physics. We derive the CFE as a function of observable galaxy properties, and find that it increases with the gas surface density, from ~1% in low-density galaxies to a peak value of ~70% at densities of ~10^3 Msun pc^-2. This explains the observation that the CFE increases with the star formation rate density in nearby dwarf, spiral, and starburst galaxies. Indeed, comparing our model results with observed galaxies yields excellent agreement. The model is applied further by calculating the spatial variation of the CFE within single galaxies. We also consider the variation of the CFE with cosmic time and show that it increases with redshift, peaking in high-redshift, gas-rich disc galaxies. It is estimated that up to 30-35% of all stars in the Universe once formed in bound stellar clusters. We discuss how our theory can be verified with Gaia and ALMA, and provide implementations for future theoretical work and for simulations of galaxy formation and evolution. Comment: 35 pages, 14 figures, 3 tables, accepted by MNRAS (10 August 2012). 
Fortran and IDL routines for calculating the cluster formation efficiency are publicly available at http://www.mpa-garching.mpg.de/cf

    Copy Number Variants Are Ovarian Cancer Risk Alleles at Known and Novel Risk Loci

    BACKGROUND: Known risk alleles for epithelial ovarian cancer (EOC) account for approximately 40% of the heritability for EOC. Copy number variants (CNVs) have not been investigated as EOC risk alleles in a large population cohort. METHODS: Single nucleotide polymorphism array data from 13 071 EOC cases and 17 306 controls of White European ancestry were used to identify CNVs associated with EOC risk using a rare admixture maximum likelihood test for gene burden and a by-probe ratio test. We performed enrichment analysis of CNVs at known EOC risk loci and functional biofeatures in ovarian cancer-related cell types. RESULTS: We identified statistically significant risk associations with CNVs at known EOC risk genes: BRCA1 (P_EOC = 1.60E-21; OR_EOC = 8.24), RAD51C (P for high-grade serous ovarian cancer [P_HGSOC] = 5.5E-4; odds ratio [OR_HGSOC] = 5.74 del), and BRCA2 (P_HGSOC = 7.0E-4; OR_HGSOC = 3.31 deletion). Four suggestive associations (P < .001) were identified for rare CNVs. Risk-associated CNVs were enriched (P < .05) at known EOC risk loci identified by genome-wide association study. Noncoding CNVs were enriched in active promoters and insulators in EOC-related cell types. CONCLUSIONS: CNVs in BRCA1 have been previously reported in smaller studies, but their observed frequency in this large population-based cohort, along with the CNVs observed at BRCA2 and RAD51C gene loci in EOC cases, suggests that these CNVs are potentially pathogenic and may contribute to the spectrum of disease-causing mutations in these genes. CNVs are likely to occur in a wider set of susceptibility regions, with potential implications for clinical genetic testing and disease prevention.